Lefschetz  Center  for  Dynamical  Systems 




ROBUSTNESS  AND  APPROXIMATION  OF  ESCAPE 
TIMES  AND  LARGE  DEVIATIONS  ESTIMATES  FOR  SYSTEMS 
WITH  SMALL  NOISE  EFFECTS 


by 


Harold  J.  Kushner 


March  1982 


LCDS  Report  #82-5 




ROBUSTNESS  AND  APPROXIMATION  OF  ESCAPE  TIMES  AND  LARGE 


DEVIATIONS  ESTIMATES  FOR  SYSTEMS  WITH  SMALL  NOISE  EFFECTS 


Harold  J.  Kushner 


Divisions  of  Applied  Mathematics  and  Engineering 
Lefschetz  Center  for  Dynamical  Systems 
Brown  University 
Providence,  Rhode  Island  02912 


March  1982 


Work  supported  in  part  by  the  Air  Force  Office  of  Scientific  Research  under 
AFOSR  81-0116,  by  the  National  Science  Foundation  under  NSF-Eng  77-12946-A02 
and  in  part  by  the  Office  of  Naval  Research  under  N00014-76-C-0279-P0004 . 



Abstract 

For  the  purposes  of  estimating  escape  time  from  a  given  set,  or  other 
statistical  properties  of  systems  with  small  noise  effects,  it  is  generally 
assumed  in  applications  that  the  system  noise  is  white  Gaussian.  The  Gaussian 
assumption  greatly  simplifies  the  computation,  but  is  not  adequate  for  many 
important  classes  of  applications  to  control  and  communication  theory.  For 
example,  when  the  noise  is  small,  the  mean  escape  time  from  a  set  can  be  quite 
sensitive  to  the  underlying  statistics  even  though  in  the  study  of  the  effects 
of  the  noise  over  any  fixed  finite  time  interval,  the  Gaussian  approximation 
might  be  a  good  one.  This  paper  is  concerned  with  the  sensitivity  of  these 
statistical  quantities  to  the  underlying  statistical  structure,  when  the  noise 
effects  are  small,  and  also  with  the  question  of  when  the  Gaussian  assumption 
makes  sense.  Consider  a  sequence  of  systems  with  small  noise  effects  whose 
statistics  converge  in  some  sense  to  those  of  a  "limit"  system.  The  techniques 
developed  involve  approximation  and  limit  theorems  for  a  sequence  of  variational 
problems  associated  with  the  minimization  of  the  action  functionals  which  arise 
when  the  theory  of  large  deviations  is  applied  to  the  above  mentioned  systems.

The  admissible  paths  and  velocity  fields  are  characterized.  Techniques  are 
developed  for  approximating  ε-optimal  or  optimal  paths  and  values  of  the
action  functionals  with  "restricted  velocity  fields",  and  these  are  used  to 
get  the  desired  limit,  approximation  and  robustness  theorems.  Degenerate  and 
non-degenerate  cases  with  both  bounded  and  Gaussian  noise  are  considered.  Several
examples  and  an  application  to  a  phase  locked  loop  system  which  arises  in 
communication  theory  are  discussed.  These  indicate  when  the  Gaussian  assumption 
might  be  acceptable  in  practice.  The  results  are  of  potential  use  in  computation,
for  they  indicate  when  the  results  for  a  simpler  "more  computable"  noise
process  might  be  a  good  approximation  to  the  results  for  the  true  noise  process. 
The  results  concerning  convergence  and  approximation  seem  to  be  of  independent 
interest  for  treating  convergence  of  the  solutions  of  a  sequence  of  more  general 
variational  problems. 


1.  Introduction 


For  the  purpose  of  estimating  escape  time  from  a  given  set,  or  other 
statistical  properties  of  systems  with  small  noise  effects,  it  is  generally 
assumed  in  applications  that  the  system  noise  is  white  Gaussian.  The  Gaussian 
assumption  greatly  simplifies  the  computation,  but  is  not  adequate  for  many 
important  classes  of  applications  to  control  and  communication  theory.  For 
example,  when  the  noise  is  small,  the  mean  escape  time  from  a  set  can  be  quite 
sensitive  to  the  underlying  statistics  even  though  in  the  study  of  the  effects 
of  the  noise  over  any  fixed  finite  time  interval,  the  Gaussian  approximation 
might  be  a  good  one.  This  paper  is  concerned  with  the  sensitivity  of  these 
statistical  quantities  to  the  underlying  statistical  structure,  when  the  noise 
effects  are  small,  and  also  with  the  question  of  when  the  Gaussian  assumption 
makes  sense.  Consider  a  sequence  of  systems  with  small  noise  effects  whose 
statistics  converge  in  some  sense  to  those  of  a  "limit"  system.  The  techniques 
developed  involve  approximation  and  limit  theorems  for  a  sequence  of  variational 
problems  associated  with  the  minimization  of  the  action  functionals  which  arise 
when  the  theory  of  large  deviations  is  applied  to  the  above  mentioned  systems. 

The  admissible  paths  and  velocity  fields  are  characterized.  Techniques  are 
developed  for  approximating  ε-optimal  or  optimal  paths  and  values  of  the
action  functionals  with  "restricted  velocity  fields",  and  these  are  used  to 
get  the  desired  limit,  approximation  and  robustness  theorems.  Degenerate  and 
non-degenerate  cases  with  both  bounded  and  Gaussian  noise  are  considered. 

Several  examples  and  an  application  to  a  phase  locked  loop  system  which  arises 
in  communication  theory  are  discussed.  These  indicate  when  the  Gaussian  assumption 
might  be  acceptable  in  practice.  The  results  are  of  potential  use  in  computation,
for  they  indicate  when  the  results  for  a  simpler  "more  computable"  noise
process  might  be  a  good  approximation  to  the  results  for  the  true  noise  process.
The  results  concerning  convergence  and  approximation  seem  to  be  of  independent
interest  for  treating  convergence  of  the  solutions  of  a  sequence  of  more  general 
variational  problems. 

We will be concerned with robustness, approximation and applications of large deviations methods [1] - [7] for processes of the type (1.1) - (1.4). The $\{\xi(\cdot)\}$, $\{\xi_n\}$ are bounded and stationary, $w(\cdot)$ is a standard Wiener process, $\{\rho_n\}$ is i.i.d. Gaussian, $\{\rho_n\}$ and $w(\cdot)$ are independent of $\{\xi_n\}$ and $\xi(\cdot)$, and $E\rho_n = 0$, covar $\rho_n = I$. The functions $\sigma(\cdot)$, $\bar b(\cdot)$ and $b(\cdot,\xi)$ are Lipschitz continuous (uniformly in $\xi$). In all cases, $x \in R^k$, Euclidean k-space.

(1.1)  $\dot x^\gamma = b(x^\gamma, \xi(t/\gamma))$

(1.2)  $dx^\gamma = b(x^\gamma, \xi(t/\gamma))\,dt + \sqrt{\gamma}\,\sigma(x^\gamma)\,dw$

(1.3)  $x^\gamma_{n+1} = x^\gamma_n + \gamma\, b(x^\gamma_n, \xi_n)$

(1.4)  $x^\gamma_{n+1} = x^\gamma_n + \gamma\, b(x^\gamma_n, \xi_n) + \gamma\, \sigma(x^\gamma_n)\rho_n$

Several modifications of (1.1) - (1.4) will also be considered. For (1.3), (1.4), define $x^\gamma(\cdot)$ to be the piecewise linear interpolation of the function with values $x^\gamma_n$ at $t = n\gamma$. Define $\bar b(\cdot)$ by

$T\bar b(x) = \lim_N \frac{1}{N} E \sum_{n=0}^{(N-1)T} b(x,\xi_n)$,

$T\bar b(x) = \lim_\gamma E\,\gamma \int_0^{T/\gamma} b(x,\xi(t))\,dt$,

and suppose that $\bar b(\cdot)$ is independent of T.
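
As an informal illustration of the averaging behind (1.3), the following Python sketch iterates the recursion for a scalar example and compares the result with the solution of the mean equation $\dot x = \bar b(x)$. The drift b(x,ξ) = -x + ξ, the two-point noise ξ_n = ±1 and all function names are assumptions chosen for illustration only; they are not taken from the text.

```python
import math
import random


def simulate_discrete_system(x0, gamma, t_end, seed=0):
    """Iterate x_{n+1} = x_n + gamma * b(x_n, xi_n) up to time t_end = n * gamma.

    Illustrative (hypothetical) choices: b(x, xi) = -x + xi with xi_n i.i.d.
    and equally likely to be -1 or +1, so the averaged drift is bbar(x) = -x.
    """
    rng = random.Random(seed)
    x = x0
    n_steps = int(round(t_end / gamma))
    for _ in range(n_steps):
        xi = rng.choice((-1.0, 1.0))
        x += gamma * (-x + xi)
    return x


def mean_ode_solution(x0, t):
    """Solution of the mean equation xdot = bbar(x) = -x."""
    return x0 * math.exp(-t)


if __name__ == "__main__":
    for gamma in (1e-2, 1e-3, 1e-4):
        x = simulate_discrete_system(1.0, gamma, 1.0)
        print(gamma, x, abs(x - mean_ode_solution(1.0, 1.0)))
```

Over a fixed time interval the iterate stays close to the mean trajectory when γ is small. The escape time problem concerns the rare paths that do not, which is why the fine structure of the noise matters there.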


The various assumptions introduced below are not always used together. Let G be a bounded open set with a piecewise differentiable boundary, define $N_\epsilon(G) = G_1$, an $\epsilon$-neighborhood of G (henceforth fixed), and assume

(A1.1) $\dot x = \bar b(x)$ has a unique stable point $x_0$ in $G_1$ and all trajectories originating in $G_1$ tend to $x_0$. Also, these trajectories are never tangent to $\partial G$.

Define the H-functionals

(1.5)  $H(x,\alpha) = \frac{1}{T} \lim_{\gamma \to 0} \gamma \log E \exp \sum_{n=M}^{N-1} \alpha'(b(x,\xi_n) + \sigma(x)\rho_n)$,

       $H(x,\alpha) = \frac{1}{T} \lim_{\gamma \to 0} \gamma \log E \exp \int_{T_0/\gamma}^{T_1/\gamma} \alpha'[b(x,\xi(s))\,ds + \sigma(x)\,dw(s)]$,

where $T_1 - T_0 = T$, $N\gamma = T_1$, $M\gamma = T_0$, and we assume that the limit does not depend on $T_1$ or $T_0$. When we wish to emphasize the Gaussian component, an affix $\sigma$ will be used and we write $H^\sigma(x,\alpha) = H^0(x,\alpha) + H_\sigma(x,\alpha)$.


(A1.2) In (1.5), let the convergence be uniform for $x \in G_1$ and also in the initial data, if E is replaced by the expectation given the $\xi(\cdot)$, or $\{\xi_n\}$, data up to time $T_0/\gamma$ or M (discrete parameter case).

The limits in (1.5) and assumption (A1.2) are phrased as they are because we wish to treat the escape time problem when the noise is not necessarily Markov. If the noise is Markov, it is sufficient to set $T_0 = M = 0$ and let the convergence alluded to in (A1.2) be uniform in x and in $\xi_0$, the initial state for the noise process. We will also sometimes use the weaker form

(A1.2') Let $T_0 = M = 0$ and let the convergence in (1.5) be uniform in $x \in G_1$.


Define the Cramer transformation $L(x,\beta) = \sup_\alpha [\beta'\alpha - H(x,\alpha)]$, and set $U(x) = \{\beta : L(x,\beta) < \infty\}$. Define

$S(T,\phi) = \int_0^T L(\phi(s),\dot\phi(s))\,ds$

if $\phi(\cdot)$ is absolutely continuous, and set it equal to $\infty$ otherwise. For $T(\phi) = \inf\{t : \phi(t) \notin G\}$, define $S(\phi) = S(T(\phi),\phi)$, $S_0(x) = \inf\{S(\phi) : \phi(0) = x\}$, $S_0 = S_0(x_0)$, and set $\tau^\gamma_G = \min\{t : x^\gamma(t) \notin G\}$. The functional $S(T,\phi)$ is called an action functional if for each $a > 0$, $h > 0$, $\delta > 0$ and bounded continuous $\phi(\cdot)$, there is a $\gamma_0 > 0$ such that for $\gamma \le \gamma_0$,

(1.6a)  $P\{d(x^\gamma,\phi) < \delta\} \ge \exp -(S(T,\phi) + h)/\gamma$

(1.6b)  $P\{d(x^\gamma,\Phi_a) \ge \delta\} \le \exp -(a - h)/\gamma$,

where $d(\cdot,\cdot)$ = sup norm distance, and $\Phi_a$ = {bounded continuous $\phi(\cdot)$: $S(T,\phi) < a$}. See [5], Theorem 2.1, where $S(T,\phi)$ is written $S_{0T}(\phi)$. Also [5], (1.6) implies that for any set A of continuous functions on [0,T],

(1.7)  $-\inf_{\phi \in A^0} S(T,\phi) \le \liminf_\gamma \gamma \log P\{x^\gamma(\cdot) \in A\} \le \limsup_\gamma \gamma \log P\{x^\gamma(\cdot) \in A\} \le -\inf_{\phi \in \bar A} S(T,\phi)$,

where $A^0$ = interior of A, and $\bar A$ = closure of A.
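
The Cramer transformation can be checked numerically in simple cases. The following Python sketch approximates $L(\beta) = \sup_\alpha[\beta\alpha - H(\alpha)]$ on a grid for a scalar, purely Gaussian H-functional $H(\alpha) = \alpha b + \sigma^2\alpha^2/2$ and compares it with the closed form $(\beta - b)^2/(2\sigma^2)$. The values b = 0.5, σ = 2, the grid parameters and the function names are illustrative assumptions, not quantities from the text.

```python
def cramer_transform(H, beta, alpha_max=10.0, n_grid=20001):
    """Grid approximation of L(beta) = sup_alpha [beta * alpha - H(alpha)] (scalar case).

    The supremum is taken over |alpha| <= alpha_max only, so the result is
    reliable only when the true maximizer lies inside that interval.
    """
    best = float("-inf")
    for i in range(n_grid):
        alpha = -alpha_max + 2.0 * alpha_max * i / (n_grid - 1)
        best = max(best, beta * alpha - H(alpha))
    return best


def gaussian_H(alpha, b=0.5, sigma=2.0):
    """H-functional of a scalar system with constant drift b and Gaussian noise."""
    return alpha * b + 0.5 * sigma ** 2 * alpha ** 2


def gaussian_L(beta, b=0.5, sigma=2.0):
    """Closed-form Cramer transform of gaussian_H."""
    return (beta - b) ** 2 / (2.0 * sigma ** 2)


if __name__ == "__main__":
    for beta in (-3.0, 0.5, 3.0):
        print(beta, cramer_transform(gaussian_H, beta), gaussian_L(beta))
```

In this Gaussian case L is finite for every β, so U(x) is the whole space. For bounded noise U(x) is bounded, which is one source of the sensitivity studied here.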

Under broad conditions

(1.8)  $\lim_\gamma \gamma \log E\tau^\gamma_G = S_0$.

See [3,5]. In [5], $\sigma = 0$ and (A1.2) was implicitly assumed. With $\sigma = 0$, the proof in [5] (Theorem 5.1) is valid for more general $\xi(\cdot)$, provided (A1.2), (A1.3) hold, and using the convergence of (1.5) uniformly in $x \in G_1$ and in the initial data. It can also be extended to cover $\sigma$ = constant (see appendix). The case where $\sigma \ne 0$, but $\xi(\cdot)$ does not appear, is in [3]. The proof in [5] was given for the continuous parameter problem, but it also works for the discrete parameter problem. Criteria for (A1.3) appear in Theorems 3.8 and 3.9.
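
For orientation, consider the scalar Gaussian special case of (1.2) in which $\xi(\cdot)$ does not appear: $dx = b(x)\,dt + \sqrt{\gamma}\,\sigma\,dw$ with b(x) = -x, G = (-1,1) and $x_0 = 0$. Then $H(x,\alpha) = \alpha b(x) + \sigma^2\alpha^2/2$ and $L(x,\beta) = (\beta - b(x))^2/(2\sigma^2)$. The Python sketch below evaluates the action along the time reversal of the mean flow, $\dot\phi = -b(\phi)$, which for a gradient example of this kind is the standard minimizing path with cost $2[V(1) - V(0)]/\sigma^2$, $V(x) = x^2/2$, and along a constant speed path for comparison. The example system, the starting point 1e-6 (the reversed flow needs infinite time to leave 0 exactly) and the function names are assumptions made for illustration.

```python
def lagrangian(x, beta, sigma):
    """L(x, beta) = (beta - b(x))^2 / (2 sigma^2) with the assumed drift b(x) = -x."""
    drift = -x
    return (beta - drift) ** 2 / (2.0 * sigma ** 2)


def action_along_reversed_flow(sigma, x_start=1e-6, x_end=1.0, n=100000):
    """Midpoint rule for int L(phi, phidot) dt along phidot = +phi.

    The integral is taken over phi instead of t, using dt = dphi / phidot.
    """
    total = 0.0
    h = (x_end - x_start) / n
    for i in range(n):
        phi = x_start + (i + 0.5) * h
        phidot = phi
        total += lagrangian(phi, phidot, sigma) * h / phidot
    return total


def action_constant_speed(sigma, speed, n=100000):
    """Midpoint rule for the action of phi(t) = speed * t on [0, 1/speed]."""
    t_end = 1.0 / speed
    h = t_end / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += lagrangian(speed * t, speed, sigma) * h
    return total


if __name__ == "__main__":
    print(action_along_reversed_flow(1.0))   # close to 1 / sigma^2
    print(action_constant_speed(1.0, 1.0))   # larger than the value above
```

With this value of $S_0$, (1.8) says that the mean escape time grows roughly like $\exp(S_0/\gamma)$, so a modest error in $S_0$ changes the estimate by a large factor when γ is small.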

(A1.3) For $x \in G_1$, $H(\cdot,\cdot)$ is continuous and $H(x,\cdot)$ is differentiable.

(A1.4) The boundary $\partial G$ is piecewise differentiable. In particular, for each $x \in \partial G$, there is a neighborhood N(x) and a finite number of differentiable functions $\theta_i(\cdot)$, $i \le k$, such that $G \cap N(x) = \{y : \theta_i(y) < 0,\ i = 1,\ldots,k\}$, $\partial G \cap N(x) = \{y : \theta_i(y) \le 0,\ i = 1,\ldots,k$, and some $\theta_i(y) = 0\}$.

In Theorems 3.11 and 4.7, the 'continuity' condition (A1.5) will be used. For open Q containing $x_0$ define $S(Q) = \inf\{S(T,\phi) : \phi(0) = x_0,\ \phi(T) \in \partial Q\}$. Then S(Q) is lower semicontinuous in Q, in that if $Q_n \to Q$, then $\liminf_n S(Q_n) \ge S(Q)$. But it is continuous at 'most' Q in the following sense. Let $Q_\rho = N_\rho(Q)$ and $\rho < \rho_1$ with $S(Q_{\rho_1}) < \infty$. Then for all but a countable number of $\rho_0 < \rho_1$, $S(Q_\rho) \to S(Q_{\rho_0})$ as $\rho \to \rho_0$.

(A1.5) $S(G_\rho)$ is continuous at $\rho = 0$.

The  quantity  (1.8)  is  of  considerable  importance  in  numerous  applications 
in  control  and  communication  theory,  and  in  various  applications  to  stochastic 
approximation,  particularly  in  estimating  escape  times  from  regions  in  which 
an  algorithm  or  process  has  a  'stability'  property.  Normally  such  estimates 
are  hard  to  get  unless  e  is  small.  Except  for  the  purely  Gaussian  case,  it 
is now almost impossible to calculate $H(\cdot,\cdot)$, $L(\cdot,\cdot)$ or $S_0$, and so the purely Gaussian model is used almost exclusively. However, the value of $S_0$ can be quite sensitive to the underlying statistical assumptions, and it is not normally satisfactory to use a 'local diffusion or Gaussian' approximation [15]. We study the problem of robustness and approximatability for such problems.

Section 2 contains a brief formal remark on approximatability by a Gaussian system. Section 3 contains various background results concerning smoothness of $H(\cdot,\cdot)$ and the admissible 'velocity fields' for the variational problem associated with getting $S_0$ or the estimate in (1.7). The main approximation results appear in Section 4. Some examples are discussed in Section 5. Section 6 discusses an application to a phase locked loop, and various problems which arise in connection with that application.

This  class  of  applications  seems  to  be  both  natural  and  of  increasing 
popularity  for  the  applications  of  large  deviations  or  singular  perturbation 
type  (partial  differential  equation  based)  methods.  Since  the  physical  noise 
in  such  systems  is  not  white  Gaussian,  or  even  Gaussian  at  all  (strictly  speaking), 
that  application  provides  a  good  example  of  the  role  of  our  results. 

In the appendix, there are some remarks concerning extending the proof in [2,5] that $S(T,\phi)$ is an action functional, to the composite cases (1.2) and (1.4), where both Gaussian and non-Gaussian noise appear.

2.  A Comment on a Gaussian Approximation


The expression $\alpha'\bar b(x) + \frac{1}{2}\alpha'\sigma(x)\sigma'(x)\alpha = H^\sigma(x,\alpha)$ is the H-functional for the system

(2.1)  $x^\gamma_{n+1} = x^\gamma_n + \gamma\,\bar b(x^\gamma_n) + \gamma\,\sigma(x^\gamma_n)\rho_n$.

The gradient and Hessian (with respect to $\alpha$ at $\alpha = 0$) of $H_N = \frac{1}{N} \log E \exp \alpha' \sum_{n=0}^{N-1} b(x,\xi_n)$ are

(2.2a)  $H_{N,\alpha}(x,0) = E \sum_{n=0}^{N-1} b(x,\xi_n)/N \equiv \bar b_N(x) \to \bar b(x)$

(2.2b)  $H_{N,\alpha\alpha}(x,0) = \frac{1}{N} E \sum_{n=0}^{N-1} \sum_{m=0}^{N-1} (b(x,\xi_n) - \bar b_N(x))(b(x,\xi_m) - \bar b_N(x))' \to \Sigma(x)$.

Both $H_N(x,\cdot)$ and $H(x,\cdot)$ are convex.

Let $\phi(\cdot)$ be absolutely continuous. Then, under broad conditions, the piecewise constant (constant on $[n\gamma, n\gamma + \gamma)$) function which has values $\sqrt{\gamma} \sum_{i=0}^{N} (b(\phi(i\gamma),\xi_i) - \bar b(\phi(i\gamma)))$ at $t = N\gamma$, converges weakly to a Wiener process with zero mean and covariance $\int_0^t \Sigma(\phi(s))\,ds$. This suggests that a suitably interpolated (1.3) can be approximated by a 'small noise' diffusion of the form $dy = \bar b\,dt + \sqrt{\gamma}\,\Sigma^{1/2}\,dw$. But such an approximation is purely formal, and is not usually valid in the sense of approximation of the large deviations results. Suppose that $\phi(\cdot)$ is an optimizing (or nearly so) path for the $S(\cdot)$ of (1.3). If

$\sup_{\alpha \in Q} [\alpha'\dot\phi(t) - H(\phi(t),\alpha)] \approx L(\phi(t),\dot\phi(t))$

for almost all t, where Q is a set where the quadratic approximation $\alpha'\bar b(x) + \alpha'\Sigma(x)\alpha/2$ to the H-functional for (1.3) determined by (2.2a,b) is acceptable, then the Gaussian approximation makes sense, but this is normally very difficult to verify.
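
The gap between the true Cramer transformation and its Gaussian approximation can be seen in a one-dimensional example. The Python sketch below assumes that b(x,ξ) = ξ with ξ_n i.i.d. and equally likely to be -1 or +1, so that $H(\alpha) = \log\cosh\alpha$, $\bar b = 0$ and the covariance in (2.2b) equals 1. The noise model and the function names are illustrative assumptions, not part of the text.

```python
import math


def L_two_point(beta):
    """Cramer transform of H(alpha) = log cosh(alpha), i.e. noise values +-1 with probability 1/2."""
    if abs(beta) > 1.0:
        return math.inf          # beta lies outside the convex hull of the noise values
    if abs(beta) == 1.0:
        return math.log(2.0)     # boundary value, reached as alpha -> +-infinity
    return 0.5 * ((1.0 + beta) * math.log(1.0 + beta)
                  + (1.0 - beta) * math.log(1.0 - beta))


def L_gaussian_approx(beta):
    """Cramer transform of the quadratic approximation alpha^2 / 2 (mean 0, variance 1)."""
    return 0.5 * beta ** 2


if __name__ == "__main__":
    for beta in (0.1, 0.5, 0.9, 1.5):
        print(beta, L_two_point(beta), L_gaussian_approx(beta))
```

Near the mean velocity β = 0 the two functions nearly coincide, which corresponds to the Gaussian approximation being reasonable over a fixed finite time interval. For velocities far from the mean they differ substantially, and the quadratic form remains finite where the true L is infinite.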

3.  Preliminary  Results 

Theorems 3.1 to 3.4 give necessary and/or sufficient conditions for $\beta$ to be in U(x) in terms of the underlying statistics. This is important, since when minimizing $S(\phi)$ or $S(T,\phi)$, we have $\dot\phi(t) \in U(\phi(t))$, and the questions of finiteness and approximatability of $S_0$ and $\inf_{\phi \in A} S(T,\phi)$ are related to the properties of U(x). Theorem 3.5 and Corollary 3.6 provide continuity and convergence results which will be useful in the sequel, and Theorems 3.8 and 3.9 provide criteria for (A1.3).

Theorem 3.1. $H(x,\cdot)$, $L(x,\cdot)$ and U(x) are convex. $L(\cdot,\cdot)$ and $U(\cdot)$ (in the Hausdorff topology) are lower semicontinuous.

If $\sigma(x)\sigma'(x)$ is uniformly positive definite on $G_1$, then $L(x,\beta) < \infty$, all $x \in G_1$ and all $\beta$. Remarks on the degenerate case are given below.

Theorem 3.2. ($\{\xi_n\}$ i.i.d., $\sigma = 0$.) Let $\xi_n$ have compact support. Then $H(x,\alpha) = \log E \exp \alpha' b(x,\xi)$ and

(a) $L(x,\beta) = \infty$ if $\beta \notin$ co range $b(x,\xi) \equiv C$,

(b) $L(x,\beta) < \infty$ if $\beta \in$ rel. int. co range $b(x,\xi)$.

Note. $\beta \in$ range $b(x,\xi)$ if for each $\epsilon > 0$, $P\{b(x,\xi) \in N_\epsilon(\beta)\} > 0$. The relative interior is relative to the smallest linear manifold which contains the set.


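Proof.

(a) Let $\beta_0 \notin C$. Then there are $\ell$, $c_0$ such that $\ell' b < c_0$, $b \in C$, $\ell'\beta_0 > c_0$. Also,

$\sup_\alpha [\alpha'\beta_0 - H(x,\alpha)] \ge \sup_{c>0} [c\ell'\beta_0 - \log E \exp c\ell' b(x,\xi)] \ge \sup_{c>0} [c\ell'\beta_0 - c c_0] = \infty$.

(b) Let $\beta_0$ be such that $P\{b(x,\xi) \in N(\beta_0)\} > 0$ for all neighborhoods $N(\beta_0)$ (in the smallest linear manifold containing C) of $\beta_0$. For convenience, define $M(\beta_0) = \{\xi : b(x,\xi) \in N(\beta_0)\}$, and $P_M(\beta_0) = P\{M(\beta_0)\}$. Then (Jensen's inequality is used to get the last line)

$\sup_\alpha [\alpha'\beta - H(x,\alpha)] \le \sup_\alpha \left[\alpha'\beta - \log P_M(\beta_0) \int_{M(\beta_0)} \exp \alpha' b(x,\xi)\, \frac{dP}{P_M(\beta_0)}\right]$

$\le \sup_\alpha \left[\alpha'\beta - \alpha' \int_{M(\beta_0)} b(x,\xi)\, \frac{dP}{P_M(\beta_0)}\right] - \log P_M(\beta_0)$.

Thus, for $\beta = \int_{M(\beta_0)} b(x,\xi)\, dP/P_M(\beta_0)$, $L(x,\beta) < \infty$. The assertion (b) follows from this, Theorem 3.1 and the fact that a convex set is a continuous function (Hausdorff topology) of its extreme points. Q.E.D.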
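
Part (a) of Theorem 3.2 can be observed numerically: for a velocity outside the convex hull of the range of b, the supremum over $|\alpha| \le \alpha_{max}$ keeps growing as $\alpha_{max}$ increases, while for a velocity in the interior it stabilizes. The Python sketch below uses a hypothetical scalar three-point noise with values -1, 0, 2 and probabilities 0.5, 0.3, 0.2, so the convex hull is [-1, 2]. The noise law, the grid step and the function names are assumptions chosen for illustration.

```python
import math

# Hypothetical three-point law for b(x, xi) at a fixed x (illustrative only).
VALUES = (-1.0, 0.0, 2.0)
PROBS = (0.5, 0.3, 0.2)


def H(alpha):
    """H(alpha) = log E exp(alpha * b), using a max-shift for numerical stability."""
    exponents = [alpha * v + math.log(p) for v, p in zip(VALUES, PROBS)]
    m = max(exponents)
    return m + math.log(sum(math.exp(e - m) for e in exponents))


def truncated_sup(beta, alpha_max, step=0.01):
    """Maximum of alpha * beta - H(alpha) over a grid on [-alpha_max, alpha_max]."""
    n = int(round(2.0 * alpha_max / step))
    best = -math.inf
    for i in range(n + 1):
        alpha = -alpha_max + i * step
        best = max(best, alpha * beta - H(alpha))
    return best


if __name__ == "__main__":
    for alpha_max in (10.0, 20.0, 40.0):
        # beta = 1.0 is inside (-1, 2); beta = 2.5 is outside [-1, 2].
        print(alpha_max, truncated_sup(1.0, alpha_max), truncated_sup(2.5, alpha_max))
```

For β = 2.5 the truncated supremum grows roughly linearly in $\alpha_{max}$, in line with the separating hyperplane argument in the proof of (a).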

The general case (1.3) or (1.1) for non i.i.d. $\{\xi_n\}$ is more complicated. We concentrate on (1.3), and first treat the finite Markov chain case.

Theorem 3.3. Let $\{\xi_n\}$ be a finite state Markov chain with state space D, and transition probabilities $\{p_{ij}\}$, and with all states communicating with each other. Then U(x) is the set of $\beta_0$ such that there are $N_n \to \infty$ and $\{z_i\}$ such that $p_{z_i z_{i+1}} > 0$ all i and

(3.1)  $\beta_0 = \lim_n \frac{1}{N_n} \sum_{i=0}^{N_n - 1} b(x,z_i)$.


Proof. By the ergodicity, such $\beta_0$ form a closed convex set. Let $\{z_i\}$, $\{N_n\}$ satisfy the hypothesis and define $\beta_0$ by (3.1). There is a $q > 0$ such that $p_{z_i z_{i+1}} \ge q$, all i. Then (the limit below exists by the discrete parameter version of Theorem 2.2 of [5]; see also Theorem 3.8 below)

$\sup_\alpha \left[\alpha'\beta_0 - \lim_N \frac{1}{N} \log E \exp \alpha' \sum_{i=0}^{N-1} b(x,\xi_i)\right]$

$\le \sup_\alpha \left[\alpha'\beta_0 - \lim_n \frac{1}{N_n} \log E \exp \alpha' \sum_{i=0}^{N_n - 1} b(x,\xi_i)\right]$

$\le \sup_\alpha \left[\alpha'\beta_0 - \liminf_n \frac{1}{N_n} \log \left(\exp \alpha' \sum_{i=0}^{N_n - 1} b(x,z_i)\right) \prod_{i=1}^{N_n - 1} p_{z_{i-1} z_i}\right]$

$\le -\log q$.

Thus $\beta_0 \in U(x)$. The reverse case can be proved by a method similar to that of Theorem 3.2(a). (See also Theorem 3.4.) Q.E.D.

Note that we also proved that $L(x,\cdot) \le -\log q$ on U(x). We now move to the general case (1.3). Put $\beta_0 \in U_0(x)$ if for each $\epsilon > 0$, there are $q_\epsilon > 0$, $N_n \to \infty$ and $\{z_i\}$ such that (3.1) holds and $P\{b(x,\xi_i) \in N_\epsilon(b(x,z_i)),\ i \le n\} \ge q_\epsilon^n$.

Theorem 3.4. Let the $\xi_n$ each have support in a compact set $C_0$. Then relative interior co $U_0(x) \subset U(x)$. Define

$R_k$ = co range $\left[\sum_{i=0}^{k-1} b(x,\xi_i)/k\right]$.

Suppose that there is a $\delta > 0$ such that the distance between $\beta_0$ and $R_k$ is $\ge \delta$ for an infinite number of k. Then $\beta_0 \notin U(x)$.

Proof. The proof of the first assertion is similar to that of Theorem 3.3. To prove the second part take a subsequence of $R_k$ (also indexed by k) and suppose that $d(\beta_0, R_k) \ge \delta > 0$ for all k (w.l.o.g.). Note that there are $\epsilon_0 > 0$ and bounded $\{c_k\}$ and unit vectors $\ell_k$ such that $\ell_k' b < c_k$ for $b \in R_k$ and $\ell_k'\beta_0 > c_k + \epsilon_0$. Assume that $\ell_k \to \ell$ and $c_k \to c_0$ (or else work with a convergent subsequence). Then

$\sup_\alpha [\alpha'\beta_0 - H(x,\alpha)] \ge \sup_{c>0} \left[c\ell'\beta_0 - \lim_k \frac{1}{k} \log E \exp k\, c\ell' \sum_{i=0}^{k-1} b(x,\xi_i)/k\right] \ge \sup_{c>0} [c\ell'\beta_0 - c c_0] = \infty$.  Q.E.D.

There is an obvious continuous parameter analog with an integral replacing the sum and (for measurable $z(\cdot)$) $P\{b(x,\xi(s)) \in N_\epsilon(b(x,z(s))),\ s \le t\} \ge \exp -t q_\epsilon$, $q_\epsilon > 0$, replacing the analogous condition above.

The continuity of $L(\cdot,\cdot)$.

Theorem 3.5. Let $H(\cdot,\cdot)$ be continuous and let $x_0$, $\beta_0$, $N_\epsilon(\beta_0)$ satisfy $L(x_0,\beta) < \infty$ for $\beta \in N_\epsilon(\beta_0)$. Then $L(\cdot,\cdot)$ is continuous at $(x_0,\beta_0)$.

Proof. By Theorem 3.1, $H(x,\cdot)$ is convex. By the hypothesis and the concavity (in $\alpha$) of $\alpha'\beta - H(x,\alpha)$, the set of maximizing $\alpha$ (at $x_0,\beta_0$) is bounded. Otherwise, for an appropriate but arbitrarily small $\delta\beta$ we would get $L(x_0,\beta_0 + \delta\beta) = \infty$. Also $\alpha'\beta - H(x,\alpha) \to \alpha'\beta_0 - H(x_0,\alpha)$ uniformly on bounded $\alpha$-sets as $(x,\beta) \to (x_0,\beta_0)$. The concavity and the last three sentences imply that the set of maximizing $\alpha$ in $\sup_\alpha [\alpha'\beta - H(x,\alpha)]$ must also converge to the set of maximizing $\alpha$ for $x_0,\beta_0$. Thus $L(x,\beta) \to L(x_0,\beta_0)$. Q.E.D.


A remark on the degenerate case.

Suppose that (1.4) has the form ($x = (x_1,x_2)$, $\alpha = (\alpha_1,\alpha_2)$, $\beta = (\beta_1,\beta_2)$)

(3.2)  $x_{1,n+1} = x_{1,n} + \gamma\, b_1(x_n)$,  $x_1 \in R^{k-\ell}$, $x_2 \in R^\ell$,

       $x_{2,n+1} = x_{2,n} + \gamma\, b_2(x_n,\xi_n) + \gamma\,\sigma_2(x_n)\rho_n$.

Then

(3.3)  $H(x,\alpha) = \alpha_1' b_1(x) + \alpha_2'\sigma_2(x)\sigma_2'(x)\alpha_2/2 + \lim_N \frac{1}{N} \log E \exp \alpha_2' \sum_{n=0}^{N-1} b_2(x,\xi_n)$,

and $L(x,\beta) = \infty$ if $\beta_1 \ne b_1(x)$. But Theorem 3.5 can be used to study continuity with respect to $\beta_2$ when $\beta_1 = b_1(x)$. If $\sigma_2(x)\sigma_2'(x)$ is uniformly positive definite and $H(\cdot,\cdot)$ is continuous, then $L(x,\beta)$ is continuous in $(x,\beta_2)$ when $\beta_1 = b_1(x)$. Similar remarks hold in the continuous parameter case. Define $U_2(x) = \{\beta_2 : L(x; b_1(x),\beta_2) < \infty\}$. In the sequel, when we refer to the degenerate case, the form (3.2) is always intended. In the non-degenerate case, we assume that U(x) has a non-empty interior, and in the degenerate case that $U_2(x)$ has a non-empty interior.


In the non-degenerate case, define $\bar U^\delta(x)$ by: $\beta \in \bar U^\delta(x)$ if $\beta \in U(x)$ and $d(\beta,\partial U(x)) \ge \delta$, where d = Euclidean distance. The set $\bar U^\delta(x)$ is called a '$\delta$-interior' set. Let $\bar U_2^\delta(x)$ denote the $\delta$-interior set for $U_2(x)$, and in the degenerate case, define $\bar U^\delta(x)$ by $\beta \in \bar U^\delta(x)$ if $\beta_1 = b_1(x)$ and $\beta_2 \in \bar U_2^\delta(x)$. The continuity of a set valued function is always in the Hausdorff topology.

An  argument  similar  to  that  of  Theorem  3.5  proves  the  following. 

Corollary 3.6. Let $U(\cdot)$ and $H(\cdot,\cdot)$ be continuous and let $H_n(x,\alpha) \to H(x,\alpha)$ uniformly on bounded $(x,\alpha)$ sets, where $H_n$ and H are H-functionals. Then, in the non-degenerate case and for any compact set K and $\delta > 0$, $L_n(x,\beta) \to L(x,\beta)$ uniformly on $\{x,\beta : x \in K,\ \beta \in \bar U^\delta(x)\} \equiv \bar U^\delta$. In the degenerate case, let $b_{1n}(\cdot)$ denote the 'mean' dynamics for the system yielding $H_n$. Then $L_n(\cdot; b_{1n}(\cdot),\cdot) \to L(\cdot; b_1(\cdot),\cdot)$ uniformly on $\{x,\beta_2 : x \in K,\ \beta_2 \in \bar U_2^\delta(x)\}$.

If $U(\cdot)$ is not continuous, the convergence holds but might not be uniform.

The  proof  of  the  following  'path  approximation'  theorem  is  omitted. 
d(*,«)  is  the  sup  norm  distance. 

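Theorem 3.7. Let $U(\cdot)$ be Lipschitz continuous for $x \in G_1$, and suppose that there is a $\delta_1 > 0$ such that $\bar U^{\delta_1}(x)$ is non-empty for all $x \in G_1$. Given $\epsilon > 0$ and $\phi(\cdot)$ such that $\phi(t) \in G_1$ and $\dot\phi(t) \in U(\phi(t))$, $t \le T$, there are $\epsilon_0 > 0$, $\delta > 0$ and absolutely continuous $\phi^\delta(\cdot)$ such that $d(\phi,\phi^\delta) \le \epsilon$, $d(\dot\phi,\dot\phi^\delta) \le \epsilon_0$, $\dot\phi^\delta(t) \in \bar U^\delta(\phi^\delta(t))$, $t \le T$, and $\epsilon_0 \to 0$ as $\epsilon \to 0$.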

Smoothness of $H(\cdot,\cdot)$.

In Theorems 3.8 and 3.9, we stick to the discrete parameter Markov chain case (compact state space D) and use a clever method of Freidlin [5] to slightly extend his results. Analogs of these theorems for the non-Markov case would be quite useful. C(D) denotes the space of continuous functions on D endowed with the sup norm topology. Define the operator $Q(x,\alpha) : C(D) \to C(D)$ by (use $\xi = \xi_0$)

(3.4)  $Q(x,\alpha)f(\xi) = E_\xi f(\xi_1) \exp \alpha' b(x,\xi_1)$,

where $f(\cdot) \in C(D)$. For $m \ge 1$, let $\|Q^m(x,\alpha)\| = \lambda_m(x,\alpha)$ denote the operator norm. Henceforth $x \in G_1$, and B is a compact $(x,\alpha)$-set. Theorems 3.8 and 3.9 give conditions under which (A1.3) holds.

Theorem 3.8. Let there be an m such that $Q^m(x,\alpha)$ is compact for each $(x,\alpha) \in B$. Suppose that $Q^m(x,\alpha)f(\xi) > 0$ for all $\xi \in D$ if $0 \not\equiv f(\cdot) \in C(D)$ and $f(\xi) \ge 0$. Then $\lambda_m(x,\alpha)$ is an isolated eigenvalue (with a one dimensional eigenspace) and the corresponding eigenvector $e_m(x,\alpha,\cdot)$ satisfies

(3.5)  $\inf_B \inf_\xi e_m(x,\alpha,\xi) \equiv \delta_0 > 0$.

Also

(3.6)  $H(x,\alpha) = \frac{1}{m} \log \lambda_m(x,\alpha)$.

The convergence defining $H(x,\alpha)$ is uniform on B and in the initial data and $H(\cdot,\cdot)$ is continuous.

Remark. $Q^m(x,\alpha)f(\xi) = E_\xi f(\xi_m) \exp \alpha' \sum_{k=1}^{m} b(x,\xi_k)$.
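
For a finite state chain the operator in (3.4) is a matrix, and the eigenvalue characterization of H can be checked directly. The Python sketch below, for scalar α and a fixed x, compares the logarithm of the largest eigenvalue of the tilted matrix (computed by power iteration) with the finite-N quantity $\frac{1}{N} \log Q^N 1(\xi)$ from the definition of H. The two-state transition matrix, the values of b and the function names are assumptions chosen for illustration. The exponential weight is placed on the next state; weighting the current state instead gives a similar matrix and hence the same eigenvalue.

```python
import math


def tilted_matrix(P, b_vals, alpha):
    """Matrix of the tilted operator: Q[i][j] = P[i][j] * exp(alpha * b_vals[j])."""
    n = len(P)
    return [[P[i][j] * math.exp(alpha * b_vals[j]) for j in range(n)] for i in range(n)]


def mat_vec(Q, v):
    return [sum(q * x for q, x in zip(row, v)) for row in Q]


def H_from_eigenvalue(P, b_vals, alpha, iters=2000):
    """log of the largest eigenvalue of Q(alpha), by power iteration (sup-norm normalization)."""
    Q = tilted_matrix(P, b_vals, alpha)
    v = [1.0] * len(P)
    lam = 1.0
    for _ in range(iters):
        w = mat_vec(Q, v)
        lam = max(w)
        v = [x / lam for x in w]
    return math.log(lam)


def H_from_definition(P, b_vals, alpha, N, start=0):
    """(1/N) log E_start exp(alpha * sum_{k=1}^{N} b(xi_k)) = (1/N) log (Q^N 1)(start)."""
    Q = tilted_matrix(P, b_vals, alpha)
    v = [1.0] * len(P)
    log_scale = 0.0
    for _ in range(N):
        v = mat_vec(Q, v)
        s = max(v)                      # rescale to avoid overflow or underflow
        v = [x / s for x in v]
        log_scale += math.log(s)
    return (log_scale + math.log(v[start])) / N


if __name__ == "__main__":
    P = [[0.9, 0.1], [0.2, 0.8]]        # hypothetical transition probabilities
    b_vals = [-1.0, 1.0]                # hypothetical values of b(x, xi) at a fixed x
    for alpha in (-0.7, 0.0, 0.7):
        print(alpha, H_from_eigenvalue(P, b_vals, alpha),
              H_from_definition(P, b_vals, alpha, N=5000))
```

The two columns agree up to a term of order 1/N for either starting state, which is a finite-dimensional instance of the uniformity in the initial data asserted in the theorem.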


Proof. The continuity of $\lambda_m(x,\alpha)$ is obvious. The rest of the proof is a slight modification of [5, Theorem 2.2]. Write $(x,\alpha) = y$. By Karlin [7], the compactness and strict positivity imply that $\lambda_m(y)$ is an isolated eigenvalue (and has a one dimensional eigenspace), and from this it is not hard to show that the corresponding eigenvector $e_m(y,\cdot)$ is strictly positive. We suppose w.l.o.g. that $\sup_\xi e_m(y,\xi) = 1$. Next we prove continuity of $e_m(\cdot,\cdot)$. Let $y_n \to y$. The set $\{Q^m(y)e_m(y_n,\cdot),\ n \ge 1\}$ lies in a compact set. Take a convergent subsequence, indexed by n, with limit $f(\cdot)$. Also $[Q^m(y_n) - Q^m(y)]e_m(y_n,\cdot)$ and hence $[\lambda_m(y_n)e_m(y_n,\cdot) - f(\cdot)]$ converge uniformly (in $\xi$) to zero. Since $\lambda_m(y_n) \to \lambda_m(y)$, $e_m(y_n,\cdot)$ converges uniformly (in $\xi$) to $f(\cdot)/\lambda_m(y)$, which must be equal to $e_m(y,\cdot)$ by uniqueness of the eigenvector. Thus since $e_m(y,\cdot)$ is continuous in $\xi$ for each y, it is continuous in $(y,\xi)$. This and the strict positivity of $Q^m(y)$ and of $Q^k(y)e_m(y,\cdot)$, $k < m$, for each $y \in B$ implies (3.5).

Now let $y \in$ compact B. By the above results, there is a $\delta_0 > 0$ such that for each $\xi \in D$

(3.7)  $\lambda_m^n(y)Q^k(y)e_m(y,\xi) = Q^{mn+k}(y)e_m(y,\xi) \le Q^{mn+k}(y)1(\xi) \le \frac{1}{\delta_0} Q^{mn+k}(y)e_m(y,\xi) = \frac{1}{\delta_0} \lambda_m^n(y)Q^k(y)e_m(y,\xi)$.

Since

(3.8)  $H(y) = \lim_\ell \frac{1}{\ell} \log Q^\ell(y)1(\xi)$,

(3.5) and (3.7) imply (3.6) and that the limit is uniform in $y \in B$, $\xi \in D$. Q.E.D.

Theorem 3.9. Let $\lambda_m(x,\alpha) = \|Q^m(x,\alpha)\|$ be an isolated eigenvalue of $Q^m(x,\alpha)$ with a one dimensional eigenspace for each $x,\alpha$. Then $\lambda_m(x,\cdot)$ is differentiable for each x. (We do not use the compactness or positivity here.)

Proof. As noted in [5], this type of result essentially follows from Kato [8]. For each $\alpha$ in some open set $A_0$, let $T(\alpha)$ be an operator in C(D), and when $\alpha = \alpha_0$, let $\zeta(\alpha_0)$ be an isolated eigenvalue with a one-dimensional eigenspace. Suppose that $\|T(\alpha) - T(\alpha_0)\| \to 0$ as $\alpha \to \alpha_0$. We can then choose eigenvalues $\zeta(\alpha)$ of $T(\alpha)$ such that $\zeta(\alpha) \to \zeta(\alpha_0)$ as $\alpha \to \alpha_0$ [8, p. 213]. Let $\Delta > 0$ be such that the distance between $\zeta(\alpha_0)$ and the {spectrum of $T(\alpha_0)$ minus $\zeta(\alpha_0)$} is at least $\Delta$. Define $\Gamma = N_{2\Delta}(\zeta(\alpha_0)) - N_\Delta(\zeta(\alpha_0))$. Then (Kato, [8], p. 208, Theorem 3.1, remark 3.2 and proof) there is a $C > 0$ such that if $\|T(\alpha) - T(\alpha_0)\| \le C \min_{\zeta \in \Gamma} \|R(\zeta)\|^{-1}$, where $R(\zeta) = (T(\alpha_0) - \zeta I)^{-1}$, then $T(\alpha)$ has no eigenvalues in $\Gamma$, but $\zeta(\alpha) \in N_\Delta(\zeta(\alpha_0))$. Since ([9], VII 3.3) $d(\zeta) \ge \|R(\zeta)\|^{-1}$, where $d(\zeta)$ = distance ($\zeta$, spectrum of $T(\alpha_0)$), we find that if $\|T(\alpha) - T(\alpha_0)\| \le C\Delta$, then $|\zeta(\alpha) - \zeta(\alpha_0)| \le \Delta$, for small $\Delta > 0$.

Fix x and define the operator $T_i(x,\alpha)$ in C(D) by

$T_i(x,\alpha)f(\xi) = E_\xi f(\xi_m) \sum_{k=1}^{m} b_i(x,\xi_k) \exp \alpha' \sum_{j=1}^{m} b(x,\xi_j)$,

where $b_i(\cdot,\cdot)$ is the i-th component of $b(\cdot,\cdot)$, and let $\alpha = (\alpha_1, \ldots)$. Let $\lambda_m(x,\delta\alpha,\alpha_0)$ denote the eigenvalue of $Q^m(x,\alpha_0) + \sum_i \delta\alpha_i T_i(x,\alpha_0)$ which converges to $\lambda_m(x,\alpha_0)$ as $\delta\alpha \to 0$. By the first paragraph and a truncated Taylor expansion of $Q^m(x,\alpha_0 + \delta\alpha)$ in $\delta\alpha$, $\lambda_m(x,\alpha_0 + \delta\alpha)$ differs from $\lambda_m(x,\delta\alpha,\alpha_0)$ by $o(|\delta\alpha|)$, where $o(|\delta\alpha|)$ is uniform in $\alpha_0$ in any bounded set. Thus it is enough to prove differentiability of $\lambda_m(x,\delta\alpha,\alpha_0)$. But this differentiability follows from the expansion (eqn. (2.17), p. 446 [8]) and the continuity in $\alpha$ of $T_i(x,\cdot)$ and $Q^m(x,\cdot)$ and we omit the details. Q.E.D.

Theorem 3.10. For each x, let the H-functional $H(x,\cdot)$ be differentiable at $\alpha = 0$, and let K be compact. For each $\delta > 0$, there is an $\epsilon > 0$ such that $L(x,\beta) \ge \epsilon$ for $|\beta - \bar b(x)| \ge \delta$, $x \in K$.


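Proof. By using $\alpha'(\beta - \bar b(x)) - \tilde H(x,\alpha)$, where $\tilde H$ is the H-functional for dynamics $\tilde b(x,\xi) = b(x,\xi) - \bar b(x)$, we can assume that $\bar b(x) = 0$. Fix $\delta > 0$. Note

$\frac{1}{N} \log E \exp \alpha' \sum_{i=0}^{N-1} b(x,\xi_i) \ge \frac{1}{N} \log \exp N\alpha' \frac{E}{N} \sum_{i=0}^{N-1} b(x,\xi_i) \to 0$ as $N \to \infty$,

since $\bar b(x) = 0$. Hence $H(x,\alpha) \ge 0$. Suppose there are $x_n \to x_0$, $\beta_n \to \beta_0$, $x_n \in K$ such that $L(x_n,\beta_n) \to 0$ and $|\beta_n| \ge \delta$. By lower semicontinuity, $\liminf_n L(x_n,\beta_n) \ge L(x_0,\beta_0)$, and $|\beta_0| \ge \delta$. By the convexity and non-negativity of $L(x,\cdot)$, $L(x_0,\beta) = 0$ for $\beta \in [0,\beta_0]$. Thus $H(x_0,\alpha) \ge \beta'\alpha$ for $\beta \in [0,\beta_0]$. This, $H(x_0,0) = 0$ and $H(x,\alpha) \ge 0$ contradict the differentiability at $\alpha = 0$. Q.E.D.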


Theorem 3.11. Let $S_0 < \infty$. Then under (A1.1) to (A1.5),

$\limsup_\gamma \gamma \log E\tau^\gamma_G \le S_0$.

(For (1.2) and (1.4) set $\sigma(\cdot)$ = constant.) Let (A1.2) hold when $T_0$ is replaced by a stopping time $\tau$ and $T_1$ by $\tau + T$. Then

$\lim_{\gamma \to 0} \gamma \log E\tau^\gamma_G = S_0$.


With the use of the assumptions concerning uniform convergence of the H-functional, the proof is essentially the same as that of Lemma 1 in [3]. The uniform convergence is important in order to get estimates (of the probability of the events used in Lemma 1 of [3]) uniformly in the conditioning data, since we do not necessarily have a Markovian set up. The proof in [3] implicitly assumes the continuity of $S_0(x)$ at $x_0$. But, under our conditions this holds by essentially the same proof as used in Theorem 5.1 of [5] (with our (A1.5) and $S_0 < \infty$ replacing (5.1) of [5]). Condition (A1.5) can be replaced by the controllability condition (A4.7) and (bounded $U(x)$ case) the existence of an $\epsilon$-optimal path satisfying the requirements of Theorem 4.7. In fact, these conditions imply (A1.5). Allowing for degenerate (see (3.2)) $b(\cdot,\cdot)$ and non-Markov noise is important in applications.
4. Approximating $U(x)$, $S(T,A)$ and $S_0$.

Lemma 4.1 and Theorems 4.2, 4.3 show that if $H_n \to H$, $\phi_n \to \phi$, then $\underline{\lim}_n S_n(\phi_n) \ge S(\phi)$, a basic result for the general approximation results. Theorems 4.4 to 4.8 show that $S_n \to S_0$ if $H_n \to H$ and some other conditions hold. Theorem 4.9 gives approximation results for inequality (1.7), when $H_n \to H$. Many of the auxiliary and intermediate approximations and techniques seem to be of independent interest.

One  or  more  of  the  following  conditions  will  be  used  throughout  the 
section,  and  will  occasionally  be  weakened.  Until  Theorem  4.9,  x  is  always 
assumed  to  be  in  . 

(A4.1) The H-functionals $H_n(\cdot,\cdot)$ converge to $H$ uniformly on bounded $(x,\alpha)$ sets.

(A4.2)  U(*)  is  continuous  in  the  Hausdorff  topology. 

(A4.3)  U(x)  and  or  5(*)  are  uniformly  bounded.  (We  will  also  treat 

the  unbounded  case.) 

For  simplicity  we  consider  2  cases,  the  non-degenerate  and  the  degenerate 
of  (3.2). 

(A4.4) There is an $\epsilon_0 > 0$ such that for all $x$ either (non-degenerate case) $N_{\epsilon_0}(\bar b(x)) \subset U(x)$ or (degenerate case) $N_{\epsilon_0}(\bar b_2(x)) \subset U_2(x)$.
Lemma 4.1. Under (A4.1), $\underline{\lim}_n L_n(x_n,\beta_n) \ge L(x,\beta)$, if $x_n \to x$, $\beta_n \to \beta$.

Proof. Let $R_N = \{\alpha : |\alpha| \le N\}$ and define $L^N(x,\beta) = \sup_{\alpha \in R_N} (\alpha'\beta - H(x,\alpha))$. Then $L^N(x,\beta) \uparrow L(x,\beta)$ as $N \to \infty$. Also

$L_n(x_n,\beta_n) \ge \sup_{\alpha \in R_N} (\alpha'\beta_n - H_n(x_n,\alpha)) \to L^N(x,\beta)$

as $n \to \infty$. The assertion follows from this, and the arbitrariness of $N$. Q.E.D.
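The truncation device in the proof of Lemma 4.1 can be illustrated numerically. The following is a minimal sketch, not part of the original argument: it assumes a hypothetical scalar Gaussian H-functional $H(\alpha) = \alpha^2/2$ (so that $L(\beta) = \beta^2/2$) and approximates the supremum over $|\alpha| \le N$ on a uniform grid. The function names and the grid size are illustrative choices.

```python
import math

def truncated_legendre(H, beta, N, steps=20001):
    """Grid approximation of L^N(beta) = sup_{|alpha| <= N} (alpha*beta - H(alpha))."""
    best = -math.inf
    for k in range(steps):
        alpha = -N + 2.0 * N * k / (steps - 1)
        best = max(best, alpha * beta - H(alpha))
    return best

def gaussian_H(alpha):
    # Hypothetical scalar example: H(alpha) = alpha^2 / 2, hence L(beta) = beta^2 / 2.
    return 0.5 * alpha * alpha

beta = 3.0
# L^N(beta) for increasing truncation levels N; the values should increase to L(beta).
values = [truncated_legendre(gaussian_H, beta, N) for N in (1, 2, 3, 4)]
print(values)
```

In this example the truncated values are non-decreasing in $N$ and stop changing once $N \ge |\beta|$, which is the monotone convergence $L^N \uparrow L$ used in the proof.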


Let $S_n(T,\phi)$, $S_n(\phi)$ denote the action functionals corresponding to the H-functional $H_n$. The next theorem is basic for the subsequent approximation results.

Theorem 4.2. Let $\phi_n(\cdot) \to \phi(\cdot)$ uniformly and $\lim_n T(\phi_n) = \bar T < \infty$. Then, under (A4.1) - (A4.4) and (A1.3),

$\underline{\lim}_n S_n(\phi_n) \ge S(\phi).$

Remark. The case where $T(\phi_n) = T_n \to \infty$ does not have much significance. If $T(\phi) < \infty$, then use the fact that $S_n(t,\phi)$ is non-decreasing in $t$, and the theorem follows by working on $[0,T(\phi)]$ in the proof. If $T(\phi) = \infty$ and $\underline{\lim}_n S_n(\phi_n) < \infty$, then $\phi(t) \to x_0$, $\phi(\cdot)$ never escapes from $G$ and $S(\phi)$ is not defined. In any case, if each $T(\phi_n) < \infty$ and $\sup_n S_n(\phi_n) < \infty$, for each $\epsilon > 0$ we can show that there is a sequence $\phi_n^\epsilon(\cdot)$ such that $S_n(\phi_n^\epsilon) \le S_n(\phi_n) + \epsilon$ and $\sup_n T(\phi_n^\epsilon) < \infty$.

Proof. Assume w.l.o.g. (choose a subsequence if necessary) that $\overline{\lim}_n S_n(\phi_n) < \infty$ and that $T(\phi_n) \to \bar T \ge T(\phi)$, $\bar T < \infty$, and let $m(\cdot)$ denote Lebesgue measure. For any $\epsilon > 0$,

$m\{t : \dot\phi_n(t) \notin N_\epsilon(U(\phi_n(t))),\ t \le T_n\} \to 0$

as $n \to \infty$, since $L_n(x,\beta) \to \infty$ uniformly in $(\beta,x)$ in any compact subset of $\{\beta,x : \beta \notin N_\epsilon(U(x))\}$. To see this, suppose that there are $\{x_n,\beta_n\}$ and $K < \infty$ such that $L_n(x_n,\beta_n) \le K$, where $\beta_n \notin N_\epsilon(U(x_n))$ and $x_n \to x$, $\beta_n \to \beta$. But $\beta \notin N_\epsilon(U(x))$ and $\underline{\lim}_n L_n(x_n,\beta_n) \ge L(x,\beta) = \infty$ by Lemma 4.1, a contradiction. By this, the convexity of $U(x)$, continuity of $U(\cdot)$, and weak convergence of $\dot\phi_n(\cdot)$ to $\dot\phi(\cdot)$, we have $m\{t : \dot\phi(t) \notin \bar U(\phi(t)),\ t \le \bar T\} = 0$; in fact it can be shown that $\bar U(\phi(t))$ can be replaced by $U(\phi(t))$ there.

— 

Now recall the definition of the $\delta$-interior set $U^{-\delta}(x)$, define $U_\epsilon(x) = N_\epsilon(U(x))$, and let $I_n^\epsilon(\cdot)$ be the indicator of the set on which $\dot\phi_n(s) \in U_\epsilon(\phi_n(s))$. We have

(4.1) $B_1 \equiv \underline{\lim}_n \int_0^{T_n} L_n(\phi_n(s),\dot\phi_n(s))\,ds \ge \underline{\lim}_n \int_0^{T_n} L_n(\phi_n(s),\dot\phi_n(s)) I_n^\epsilon(s)\,ds \equiv B_2.$

Let $\delta > \epsilon$. For large $n$ and small $\delta$ and $\epsilon$, there is a measurable function $\Delta_n(\cdot)$ with $|\Delta_n(t)| \le 2\delta$ and such that $\dot\phi_n(t) + \Delta_n(t) \in U^{-\delta}(\phi_n(t))$ for all $t$ such that $\dot\phi_n(t) \in U_\epsilon(\phi_n(t))$, and such that for these $t$ and small $\delta$ and $\epsilon$,

(*) $L_n(\phi_n(t),\dot\phi_n(t) + \Delta_n(t)) - L_n(\phi_n(t),\dot\phi_n(t)) \le \delta_1$,

where $\delta_1 \to 0$ as $\delta \to 0$.

To prove the last assertion, define $h_\delta(x,\beta)$ as follows, for $\beta \in U_\epsilon(x) - U^{-\delta}(x)$. (We do the non-degenerate case; the proof in the degenerate case is almost the same.) Let $h_\delta(x,\beta)$ be the unique intersection on $\partial U^{-\delta}(x)$ of the line segment $\{z : z = s\beta + (1-s)\bar b(x),\ 0 \le s \le 1\}$ connecting $\beta$ and $\bar b(x)$. If $\beta \in U^{-\delta}(x)$, set $h_\delta(x,\beta) = \beta$. Then $[h_\delta(\phi_n(\cdot),\dot\phi_n(\cdot)) - \dot\phi_n(\cdot)] I_n^\epsilon(\cdot) = \Delta_n(\cdot)$ is measurable. Now we prove (*). Suppose that (*) is false. In particular, suppose that for some small $\delta_0 > 0$ there are $x_n, \beta_n$, $x_n \to x$, $\beta_n \to \beta$, $\delta_n \to 0$, $\epsilon_n \to 0$, with $\beta_n \in U_{\epsilon_n}(x_n)$ and such that $L_n(x_n, h_{\delta_n}(x_n,\beta_n)) - L_n(x_n,\beta_n) \ge \delta_0$. Then the convexity of $L_n(x,\cdot)$ and the fact that $h_{\delta_n}(x_n,\beta_n) - \beta_n \to 0$ as $n \to \infty$ imply that the derivative (in the direction of increasing $s$) at some $s = s_n \to 0$ along the line segment $\{s\beta_n + (1-s)\bar b(x_n)\}$ increases to $\infty$ as $n \to \infty$. By convexity, the derivative is non-decreasing as $s$ increases. This and the uniform convergence $L_n(\cdot,\cdot) \to L(\cdot,\cdot)$ on $\{x,\beta : x \in \text{compact } K,\ \beta \in U^{-\delta}(x)\}$ for each $K$ and $\delta > 0$ lead to a contradiction to (A4.2), (A4.4). In particular, we get $L(x,\bar b(x)) = \infty$, contradicting $L(x,\bar b(x)) = 0$. Thus (*) holds.

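By (4.1) and (*),

$B_2 \ge \underline{\lim}_n \int_0^{T_n} L_n(\phi_n(t),\dot\phi_n(t) + \Delta_n(t)) I_n^\epsilon(t)\,dt - \delta_1 \bar T.$

By Corollary 3.6,

$\lim_\epsilon \overline{\lim}_n \sup_{|x-y|<\epsilon,\ \beta \in U^{-\delta}(x)} |L_n(y,\beta) - L_n(x,\beta)| = 0$, $y,x \in$ compact in $G_1$.

Thus, by the uniform convergence $\phi_n(\cdot) \to \phi(\cdot)$, for each $\delta_0 > 0$ there is an $\epsilon_0 > 0$ such that $|t - \tau| < \epsilon_0$ implies that for large $n$

$|L_n(\phi_n(\tau),\beta) - L_n(\phi_n(t),\beta)| \le \delta_0$, $\beta \in U^{-\delta}(\phi_n(t))$.

Define a finite sequence $\{t_i,\ i = 1,\ldots,q\}$ such that $t_{i+1} > t_i$, $t_{i+1} - t_i \le \epsilon_0$, $t_0 = 0$, $t_q = \bar T$, and set $L_n(\phi_n(t),\dot\phi_n(t)) = 0$ for $t \ge T_n$. Then (the last inequality below uses Jensen's inequality and the convexity of $L_n(x,\cdot)$)

(4.2) $B_2 \ge -\delta_1 \bar T - \delta_0 \bar T + \underline{\lim}_n \sum_i \int_{t_i}^{t_{i+1}} L_n(\phi_n(t_i),\dot\phi_n(s) + \Delta_n(s)) I_n^\epsilon(s)\,ds \ge -(\delta_0 + \delta_1)\bar T + \underline{\lim}_n \sum_i (t_{i+1} - t_i) L_n(\phi_n(t_i), f_i^{n,\epsilon}),$

where

$f_i^{n,\epsilon} = \frac{1}{t_{i+1} - t_i} \int_{t_i}^{t_{i+1}} (\dot\phi_n(s) + \Delta_n(s)) I_n^\epsilon(s)\,ds.$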


Assume  (or  take  a  suitable  subsequence)  that  A^f*)  converge  to  a 


function  A(*).  Define 

*(ti+1)  -  -Htj)  A(ti  +  i)  -  A(ti) 


fi=  [ 


t.  ,  -  t. 
i+l  i 


ti+r  ti 


] 


Then $f_i^{n,\epsilon} \to f_i$ as $n \to \infty$, for each $\epsilon > 0$. By Lemma 4.1, (4.2) and the lower semicontinuity of $L(\cdot,\cdot)$ and its continuity on $\{x,\beta : \beta \in U^0(x)\}$,


(4.3)  Bj  >  -  T^+fi^  +  I(ti+1-ti)L(4>(t.),fi) 
rT  1 


e(T° 


-T6X  + 


L(4>(s) ,  $(s)  +  A(sf)ds . 


Finally, letting $\epsilon \to 0$, $\delta \to 0$ and again using the lower semicontinuity of $L(\cdot,\cdot)$ yields the theorem. Q.E.D.
We next treat an unbounded $U(x)$ case.

(A4.5) Let (a) $\inf_{x \in G_1} L(x,\beta)/|\beta| \to \infty$ as $|\beta| \to \infty$,

(b) (non-degenerate) $\sup_{|\beta| \le B} \sup_{x \in G_1} L(x,\beta) < \infty$, all $B < \infty$;

(degenerate) let $\beta_1 = \bar b_1(x)$, and take the sup only over $|\beta_2| \le B$.

The conditions hold for (1.2), (1.4) if (non-degenerate case) $\sigma(x)\sigma'(x)$ is uniformly positive definite, and (degenerate case) if $\sigma_2(x)\sigma_2'(x)$ is uniformly positive definite.
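To make the two parts of (A4.5) concrete, here is a small sketch under a simplifying assumption that is not in the text: a scalar system with purely Gaussian noise, for which $H(\alpha) = \alpha \bar b + \sigma^2\alpha^2/2$ and the Legendre transform is $L(\beta) = (\beta - \bar b)^2/2\sigma^2$. The numerical values of `b_bar` and `sigma` are hypothetical.

```python
def gaussian_L(beta, b_bar, sigma):
    """Legendre transform of H(alpha) = alpha*b_bar + (sigma*alpha)**2 / 2 (scalar case)."""
    return (beta - b_bar) ** 2 / (2.0 * sigma ** 2)

b_bar, sigma = 0.5, 2.0  # hypothetical values, chosen only for illustration

# Part (a): L(beta)/|beta| grows without bound as |beta| increases.
growth = [gaussian_L(B, b_bar, sigma) / B for B in (10.0, 100.0, 1000.0)]

# Part (b): L stays finite on a bounded beta-set when sigma > 0.
bounded = max(gaussian_L(beta, b_bar, sigma) for beta in (-5.0, 0.0, 5.0))
print(growth, bounded)
```

If `sigma` were allowed to approach zero, the bound in part (b) would blow up, which is why uniform positive definiteness is required.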

Theorem 4.3. Under (A1.3), (A4.1), (A4.5), the conclusions of Theorem 4.2 hold.

Proof. Let $\phi_n(\cdot) \to \phi(\cdot)$ uniformly and w.l.o.g. let $\overline{\lim}_n S_n(\phi_n) < \infty$ and $T(\phi_n) = T_n \to \bar T \ge T(\phi)$, $\bar T < \infty$. For notational simplicity, we do the non-degenerate case only. The proof for the degenerate case requires only minor modifications. Since $U(x)$ = entire space, Corollary 3.6 implies that

(4.4) $\lim_{|\beta| \to \infty} \underline{\lim}_n \inf_{x \in G_1} \frac{L_n(x,\beta)}{|\beta|} = \infty.$

Also, (4.4) and $\overline{\lim}_n S_n(\phi_n) < \infty$ imply that

(4.5) $\lim_{K \to \infty} \overline{\lim}_n m\{t : t \le T_n,\ |\dot\phi_n(t)| \ge K\} = 0.$

Define $I_K^n(\cdot)$ = indicator of the set where $|\dot\phi_n(t)| \le K$. Then (4.1) holds with $I_K^n$ replacing $I_n^\epsilon$. By the uniform convergence of $L_n(\cdot,\cdot)$ to $L(\cdot,\cdot)$ on bounded sets and the continuity of $L(\cdot,\cdot)$, for each $\delta_0 > 0$, there are $\epsilon_0 > 0$, $\{t_i\}$, $t_0 = 0$, $0 < t_{i+1} - t_i \le \epsilon_0$ as in Theorem 4.2, and such that

$\underline{\lim}_n \int_0^{T_n} L_n(\phi_n(s),\dot\phi_n(s)) I_K^n(s)\,ds \ge -\delta_0 \bar T + \underline{\lim}_n \sum_i \int_{t_i}^{t_{i+1}} L_n(\phi_n(t_i),\dot\phi_n(s)) I_K^n(s)\,ds.$

The proof is completed in essentially the same way that the proof of Theorem 4.2 was completed, except that $K \to \infty$ replaces $\epsilon \to 0$, there is no need to introduce $\Delta_n(\cdot)$, and

(4.6) $\lim_K \overline{\lim}_n \int_0^{T_n} |\dot\phi_n(s)| (1 - I_K^n(s))\,ds = 0$

is used to get


$\int_{t_i}^{t_{i+1}} \dot\phi_n(s) I_K^n(s)\,ds \to \phi(t_{i+1}) - \phi(t_i)$, as $n \to \infty$, then $K \to \infty$.


Q.E.D. 


Limits of $\{S_n\}$. The functional $S_n$ corresponds to a system of one of the types (1.1) to (1.4) with dynamical terms $b$, $F$, $\sigma$ subscripted by $n$ and $\xi^n$ replacing $\xi$, where the 'mean' dynamical term is $\bar b_n(\cdot)$. As $n \to \infty$, $\bar b_n(x) \to \bar b(x)$ and many types of assumptions on the behavior of $\dot x = \bar b_n(x)$ can be dealt with. Here we simply assume (A4.6).

(A4.6) The system corresponding to H-functional $H_n$ satisfies (A1.1), but where $x_n$ replaces $x_0$ and $x_n \to x_0$ as $n \to \infty$.
For the degenerate case, we need the 'controllability' condition (A4.7). In the non-degenerate case, with the unbounded $U(x)$, (A4.7) always holds if the conditions $|\dot\phi_2| \le M$ and $\dot\phi_1 = \bar b_1(\phi)$ are replaced by $|\dot\phi| \le M$. In the non-degenerate case with bounded $U(x)$, (A4.7) always holds if the condition $\dot\phi_2(t) \in U_2^{-\delta}(\phi(t))$ is replaced by $\dot\phi(t) \in U^{-\delta}(\phi(t))$.

(A4.7) (Unbounded $U(x)$ case.) There is an $M < \infty$ such that for each small $\epsilon_2 > 0$ and each $y \in N_{\epsilon_2}(x_0)$, there is a function $\phi(\cdot) = (\phi_1(\cdot),\phi_2(\cdot))$ such that $\phi(0) = x_0$, $\phi(t_y) = y$ for some $t_y \le T_{\epsilon_2}$, where $T_{\epsilon_2} \to 0$ as $\epsilon_2 \to 0$, and $\dot\phi_1 = \bar b_1(\phi)$, $|\dot\phi_2| \le M$.

(Bounded $U(x)$ case.) Simply replace $M$ and $|\dot\phi_2| \le M$ by $\dot\phi_2(t) \in U_2^{-\delta}(\phi(t))$, for some $\delta > 0$.

Theorem 4.4. (Unbounded $U(x)$ case.) Assume (A4.1), (A4.5), (A4.6) and (A4.7) (for the degenerate case) and (A1.1), (A1.3), (A1.4). Then

$S_n \to S_0.$


Note  (A1.2)  is  not  used  here.  The  theorem  makes  no  direct  claim 
concerning  escape  times  and  the  H-functionals  are  defined  by  (1.5). 


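Proof. Fix $\epsilon > 0$, let $S_0 < \infty$ and let $\phi^\epsilon(\cdot)$ be an $\epsilon$-optimal path for $S(\cdot)$ with $\phi^\epsilon(0) = x_0$. Write $T_\epsilon = T(\phi^\epsilon)$. Below, we show that for small $\epsilon_3 > 0$, $\phi^\epsilon(\cdot)$ can be selected such that it is defined until $T_{\epsilon_3}$, the exit time from $N_{\epsilon_3}(G)$, and $S(T_{\epsilon_3},\phi^\epsilon) \le S_0 + 3\epsilon$, and for some $K < \infty$, $|\dot\phi^\epsilon(t)| \le K$, and $\phi^\epsilon(\cdot)$ is not tangent to any of the boundary curves at the exit point from $G$. Assume this for the moment. In this part of the proof we do only the (more difficult) degenerate problem.

Define $\phi_n^\epsilon$ by (in the non-degenerate case we would set $\phi_n^\epsilon(0) = x_n$, $\dot\phi_n^\epsilon = \dot\phi^\epsilon$)

$\phi_{1n}^\epsilon(t) = x_{1n} + \int_0^t \bar b_{1n}(\phi_n^\epsilon(s))\,ds,$

$\phi_{2n}^\epsilon(t) = x_{2n} + \int_0^t \dot\phi_2^\epsilon(s)\,ds,$

where $x_n$ is defined in (A4.6), and $\bar b_n = (\bar b_{1n},\bar b_{2n})$. Recall that $\bar b(\cdot)$ is Lipschitz continuous. Then, by the properties of $\phi^\epsilon$ assumed in the last paragraph, $T_n^\epsilon = T(\phi_n^\epsilon) < \infty$ for large $n$ and $T_n^\epsilon \to T_\epsilon$ as $n \to \infty$. By the boundedness of $\dot\phi^\epsilon(\cdot)$ and the uniform convergence of $L_n(x;\bar b_{1n}(x),\beta_2)$ to $L(x;\bar b_1(x),\beta_2)$ on bounded $(x,\beta_2)$ sets,

(4.7a) $S_n = S_n(x_n) \le S_n(T_n^\epsilon,\phi_n^\epsilon) = \int_0^{T_n^\epsilon} L_n(\phi_n^\epsilon(s);\ \bar b_{1n}(\phi_n^\epsilon(s)),\ \dot\phi_2^\epsilon(s))\,ds \to \int_0^{T_\epsilon} L(\phi^\epsilon(s);\ \bar b_1(\phi^\epsilon(s)),\ \dot\phi_2^\epsilon(s))\,ds \le S_0 + 3\epsilon.$

Thus

(4.7b) $\overline{\lim}_n S_n \le S_0.$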

We now show that there is a $\phi^\epsilon(\cdot)$ of the desired form. Let $\phi^\epsilon(\cdot)$ be an $\epsilon$-optimal path for $S(\phi)$ with $\phi^\epsilon(0) = x_0$. We do the non-degenerate case only, for the sake of notational simplicity. A very similar construction yields $\phi^\epsilon(\cdot)$ of the desired form for the degenerate case. Let $I_K^\epsilon(\cdot)$ denote the indicator of the set where $|\dot\phi^\epsilon(s)| \le K$. By (A4.5) and $S_0 < \infty$,

(4.8) $\lim_{K \to \infty} \int_0^{T_\epsilon} |\dot\phi^\epsilon(s)| (1 - I_K^\epsilon(s))\,ds = 0.$

For $y \in G$ and any $K < \infty$ define $\phi_y^K(\cdot)$ by

$\phi_y^K(t) = y + \int_0^t \dot\phi^\epsilon(s) I_K^\epsilon(s)\,ds.$

There is an $M < \infty$ and $\epsilon_2 > 0$ such that for each $y$ satisfying $|y - x_0| \le \epsilon_2$ there is a $\hat\phi_y(\cdot)$ satisfying $\hat\phi_y(0) = x_0$, $\hat\phi_y(t_y) = y$, with $|\dot{\hat\phi}_y(t)| \le M$ and $S(t_y,\hat\phi_y) \le \epsilon$, where $t_y \to 0$ as $y \to x_0$. Define $\tilde\phi_y^K(\cdot)$ by

$\tilde\phi_y^K(t) = \hat\phi_y(t)$, $t \le t_y$,

$\tilde\phi_y^K(t) = \phi_y^K(t - t_y)$, $t > t_y$.

By (4.8) and the continuity of $L(\cdot,\cdot)$, we can find a sequence $\{y_a,K_a\}$ where $K_a \to \infty$ as $a \to \infty$ and such that for large $a$, $\tilde\phi_{y_a}^{K_a}(\cdot)$ satisfies the conditions required on $\phi^\epsilon(\cdot)$ in the first paragraph of the proof (where $\epsilon_3$ now depends on the chosen $y_a, K_a$). Recall $S_n = \inf\{S_n(\phi) : \phi(0) = x_n\} = S_n(x_n)$.

Now, to get the reverse inequality to (4.7b) for either the degenerate or the non-degenerate case, let $\sup_n S_n < \infty$ and let $\phi_n^\epsilon(\cdot)$ be an $\epsilon$-optimal path for $S_n(\phi)$. We can select $\phi_n^\epsilon(\cdot)$ such that $T_n^\epsilon = T(\phi_n^\epsilon) \to \bar T < \infty$. Let $I_n^{\epsilon,K}(t)$ denote the indicator of the set where $|\dot\phi_n^\epsilon(t)| \ge K$. By (A4.5), the convexity of $L_n(x,\cdot)$ and $L(x,\cdot)$ and the uniform convergence on bounded sets, for each large $N < \infty$ there is a $K_N < \infty$ such that


(4.9) 


yy  in  J  niyt)|ieN>  (t)dt 


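for large $n$. Thus, the set $\{\phi_n^\epsilon(\cdot),\ n \text{ large},\ \epsilon > 0\}$ is uniformly absolutely continuous. Extract a convergent subsequence, indexed by $n$, and with limit $\bar\phi^\epsilon(\cdot)$, where $\bar\phi^\epsilon(0) = x_0$. By Theorem 4.3,

(4.10) $\epsilon + \underline{\lim}_n S_n(x_n) \ge \underline{\lim}_n S_n(\phi_n^\epsilon) \ge S(\bar\phi^\epsilon) \ge S_0$, so that $\underline{\lim}_n S_n \ge S_0$.

Thus, $S_n \to S_0$. Q.E.D.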


A  useful  special  case  is  given  by  Theorem  4.5.  See  also  Theorem  4.6. 


Theorem 4.5. Let the H-functionals satisfy $H_n(x,\alpha) \ge H(x,\alpha)$, each $x,\alpha$. Then $S_n \le S_0$ and under the conditions of Theorems 4.2 or 4.3, $S_n \to S_0$ as $n \to \infty$.

The theorem is obvious, since $L_n(x,\beta) \le L(x,\beta)$. A case of particular

interest  is  where  b(x,S  )  =  b(x,£  )  +  b  (x,£  ),  and  (C  1  and  (S  )  are 

n  n  n  n  n  n 

independent  of  one  another  and  Hn(x,a)  ■+  o,  uniformly  on  bounded  (x,<*) 

A  ~  A  » 

sets  (where  H  and  H  are  the  H-functionals  corresponding  to  b  and  b, 
respectively). Then if the system corresponding to b satisfies the conditions of Theorems 4.2 or 4.3, $S_n \to S_0$ as $n \to \infty$.

The H-functional for (1.2) or (1.4) takes the form (where $H^0$ is the H-functional for $\sigma = 0$)

$H^\sigma(x,\alpha) = H^0(x,\alpha) + \alpha'\sigma(x)\sigma'(x)\alpha/2.$

Theorem 4.6. Let $H_n^\sigma(x,\alpha) = H_n^0(x,\alpha) + \alpha'\sigma(x)\sigma'(x)\alpha/2$, where we assume the conditions of Theorem 4.4 with $H^\sigma$ and $H_n^\sigma$ replacing $H$ and $H_n$, resp. Then $S_n^\sigma \to S_0^\sigma$ as $n \to \infty$. Furthermore, if $H^0$ satisfies (A4.1 to 4), (A1.3) in the bounded $U(x)$ case or (A1.3), (A4.1), (A4.3) in the unbounded $U(x)$ case, then $S_0^\sigma \to S_0^0$ as $\sigma \to 0$.

The theorem follows from Theorems 4.2 to 4.5. Thus, when the system contains (independent) Gaussian noise, the exit times are robust with respect to changes in the other system noises. Also, the addition of small Gaussian noise changes the exit times only slightly under broad conditions.
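The effect of adding a small Gaussian term to an H-functional can be checked numerically. The sketch below is an illustration under assumptions that are not in the text: the scalar non-Gaussian part is taken to be a centered Poisson functional $H^0(\alpha) = \lambda(e^\alpha - 1 - \alpha)$, and the Legendre transform is approximated by a maximum over a finite grid of $\alpha$ values. Since $H^\sigma \ge H^0$, the transforms satisfy $L^\sigma \le L^0$ (the mechanism of Theorem 4.5), and $L^\sigma$ should approach $L^0$ as $\sigma \to 0$.

```python
import math

def legendre(H, beta, lo=-10.0, hi=10.0, steps=40001):
    """Grid approximation of sup_alpha (alpha*beta - H(alpha)) over [lo, hi]."""
    grid = (lo + (hi - lo) * k / (steps - 1) for k in range(steps))
    return max(a * beta - H(a) for a in grid)

def H0(alpha, lam=1.0):
    # Hypothetical non-Gaussian part: centered Poisson noise with rate lam.
    return lam * (math.exp(alpha) - 1.0 - alpha)

def H_sigma(alpha, sigma):
    # H^sigma = H^0 + sigma^2 * alpha^2 / 2 (scalar version of the form above).
    return H0(alpha) + 0.5 * (sigma * alpha) ** 2

beta = 2.0
L0 = legendre(H0, beta)
L_sig = [legendre(lambda a, s=s: H_sigma(a, s), beta) for s in (1.0, 0.5, 0.1, 0.01)]
print(L0, L_sig)
```

For this choice of $H^0$ the exact transform is $(\beta + \lambda)\log((\beta + \lambda)/\lambda) - \beta$, which gives a closed form to compare the grid values against.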

Theorem 4.7. (Bounded $U(x)$.) Assume (A1.1,3,4) and (A4.1,2,3,4,6,7). Suppose that for each $\epsilon > 0$, there is a $\delta > 0$ such that there is an $\epsilon$-optimal path $\phi^\epsilon(\cdot)$ (with $\phi^\epsilon(0) = x_0$) for $S(\cdot)$, with $\dot\phi^\epsilon(t) \in U^{-\delta}(\phi^\epsilon(t))$. Then $S_n \to S_0$ as $n \to \infty$.

The proof uses arguments developed in the theorems of this section and only a few comments will be made. To get (4.7b) we roughly follow the proof of that result in Theorem 4.4. The controllability (A4.7), the continuity of $U(\cdot)$, and a piecing together argument (such as used for the construction of $\phi^\epsilon(\cdot)$ in Theorem 4.4) are used to get an $\epsilon$-optimal path (starting from $x_0$) which satisfies the requirements in the third sentence of the proof of Theorem 4.4, except that $|\dot\phi^\epsilon(t)| \le K$ is replaced by (degenerate case) $\dot\phi_2^\epsilon(t) \in U_2^{-\delta}(\phi^\epsilon(t))$, $\dot\phi_1^\epsilon(t) = \bar b_1(\phi^\epsilon(t))$, or (non-degenerate case) $\dot\phi^\epsilon(t) \in U^{-\delta}(\phi^\epsilon(t))$, for some $\delta > 0$. Then define $\phi_n^\epsilon(\cdot)$ as in the second paragraph of the proof of Theorem 4.4, and in the analogous way for the non-degenerate case. There is a $\delta' > 0$ such that for large $n$, (degenerate case) $\dot\phi_{2n}^\epsilon(t) \in U_2^{-\delta'}(\phi_n^\epsilon(t))$ (and $\dot\phi_n^\epsilon(t) \in U^{-\delta'}(\phi_n^\epsilon(t))$ for the non-degenerate case). Then use (4.7a) (or the analogous formula for the non-degenerate case) and the convergence $L_n(\cdot,\cdot) \to L(\cdot,\cdot)$ uniformly on $\{x,\beta : x \in \text{compact } K,\ \beta \in U^{-\delta'}(x)\}$ to get (4.7b). The proof of (4.10) is very similar to the proof used in Theorem 4.4, whether or not the $U_n(x)$ are bounded. The appropriate convergent subsequence of $\{\phi_n^\epsilon(\cdot)\}$ is extracted by using the nature of the convergence of $L_n(\cdot,\cdot) \to L(\cdot,\cdot)$ and the boundedness of the $U(x)$.

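In the next theorem we show that the $U^{-\delta}(x)$-approximation required by the last theorem exists under reasonable conditions. We actually show the existence of a slightly modified set, called $\tilde U^\delta(x)$, which can be used in place of $U^{-\delta}(x)$. For $0 < \delta < 1$, define $\{x_n^{\gamma,\delta}\}$ by

$x_{n+1}^{\gamma,\delta} = x_n^{\gamma,\delta} + \gamma \bar b(x_n^{\gamma,\delta}) + \gamma(1-\delta)\tilde b(x_n^{\gamma,\delta},\xi_n)$, $x_0^{\gamma,\delta} = x_0^\gamma = x_0$,

where $\tilde b = b - \bar b$. Let $L^\delta$ denote the L-functional for $\{x_n^{\gamma,\delta}\}$ and let $\tilde H$ denote the H-functional for $\tilde b$. Then

$L^\delta(x,\beta) = \sup_\alpha [\alpha'(\beta - \bar b(x)) - \tilde H(x,(1-\delta)\alpha)] = L(x, \frac{v}{1-\delta} + \bar b(x)),$

where $v = \beta - \bar b(x)$. Define $\tilde U^\delta(x)$ by: $\beta \in \tilde U^\delta(x)$ if $\beta = \bar b(x) + (1-\delta)v$, where $\bar b(x) + v \in U(x)$. Clearly, under (A4.4), $\tilde U^\delta(x)$ can be used instead of $U^{-\delta}(x)$ in the previous theorems (analogously for $\tilde U_2^\delta(x)$ and $U_2^{-\delta}(x)$ in the degenerate case).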
Define $\tilde L^\delta(\cdot,\cdot)$ by: $\tilde L^\delta(x,\beta) = L(x,\beta)$ for $\beta \in \tilde U^\delta(x)$, and equal to infinity otherwise. Let $\tilde S^\delta(\cdot)$ denote the action functional corresponding to $\tilde L^\delta(\cdot)$. Let $x^{\gamma,\delta}(\cdot)$ denote the piecewise linear interpolation of $\{x_n^{\gamma,\delta}\}$ with interpolation interval $\gamma$.

Theorem 4.8. Under (A4.1 to 4) and (A1.1 to 5), $\tilde S_0^\delta \to S_0$ as $\delta \to 0$. If (A4.6,7) also holds, then $S_n \to S_0$.

Remark. The first sentence of the theorem implies that $\tilde U^\delta(x)$ satisfies the requirements put on $U^{-\delta}(x)$ in Theorem 4.7.

Proof. For notational reasons only, we work with the non-degenerate case. First we show that for each compact x-set $K$ there is a $c(\delta)$ which goes to 0 as $\delta \to 0$ and such that (if $\beta \notin \tilde U^\delta(x)$, then both sides are infinite)

(4.11) $\tilde L^\delta(x,\beta) \le L^\delta(x,\beta) + c(\delta)$, $x \in K$.

Suppose (4.11) is false. Then there are $c > 0$, $\delta_n \to 0$, $x_n \in K$ and $\bar b(x_n) + v_n = \beta_n \in \tilde U^{\delta_n}(x_n)$ such that (recall the form of $L^\delta$ given above the theorem)

(*) $L(x_n, \bar b(x_n) + v_n) - L(x_n, \frac{v_n}{1-\delta_n} + \bar b(x_n)) \ge c.$

This relation is impossible unless $d(\bar b(x_n) + v_n, \partial U(x_n)) \to 0$ as $n \to \infty$. Using this and (*) and the convexity of the $L(x,\cdot)$, we get that $L(x_n,\bar b(x_n)) \to \infty$ as $n \to \infty$, a contradiction. Thus (4.11) holds. We can show that


(4.12) $\underline{\lim}_\delta \tilde S_0^\delta \ge S_0$

by a proof similar to that in the last part of Theorem 4.4. (See also the comment after Theorem 4.7.)

We now adapt a device used in [5, Lemma 5.1]. Let $S_0 < \infty$. For $T < \infty$ and small $\rho > 0$, define the sets $A^{\gamma,\delta}$ = event that $x^{\gamma,\delta}(\cdot)$ leaves $G$ by time $T$, and $A_\rho^\gamma$ = event that $x^\gamma(\cdot)$ leaves $N_\rho(G)$ by time $T$. For small $\delta > 0$, $P\{A^{\gamma,\delta}\} \ge P\{A_\rho^\gamma\}$. For each small $h > 0$, there is a $\rho > 0$, a $T^{h,\rho}$ (which we can suppose is bounded uniformly in $h,\rho$) and a function $\phi^{h,\rho}(\cdot)$ such that $\phi^{h,\rho}(0) = x_0$, $\phi^{h,\rho}(T^{h,\rho}) \in \partial N_\rho(G)$ and (recall (A1.5)) $S_0 \le S(T^{h,\rho},\phi^{h,\rho}) \le S_0 + h$. Then, by (1.6), $P\{A_\rho^\gamma\} \ge \exp -[S_0 + 2h]/\gamma$ for small $\gamma$. Let $A$ = set of continuous paths $\phi(\cdot)$ which leave $G$ by time $T$ and have $\phi(0) = x_0$. Then, for each $h$ there is a $\gamma_0 > 0$ such that for $\gamma \le \gamma_0$,

$P\{A^{\gamma,\delta}\} \le \exp -[\inf_{\phi \in A} S^\delta(T,\phi) - h]/\gamma.$

Combining (4.11) with the estimates in the last paragraph yields that for some $T_0 < \infty$

(4.13) $\tilde S_0^\delta - c(\delta)T_0 \le S_0^\delta \le \inf_{\phi \in A} S^\delta(T,\phi) \le S_0 + 3h,$

where $T_0$ can be taken to be an upper bound (over small $\delta > 0$) for $\{T(\phi^\delta)\}$, where the $\phi^\delta$ are such that $S^\delta(\phi^\delta) \le S_0^\delta + \delta$. Combining (4.12) and (4.13) yields the first assertion of the theorem.

Using $\tilde U^\delta(x)$ for the $U^{-\delta}(x)$ in Theorem 4.7 yields the last assertion of the theorem. Q.E.D.


There are also approximation theorems for the inequalities (1.7). Let $A$ be a set of continuous functions on $[0,T]$. Define $S_n(T,A) = \inf_{\phi \in A} S_n(T,\phi)$. Then

$-S_n(T,A^0) \le \underline{\lim}_\gamma\ \gamma \log P\{x_n^\gamma(\cdot) \in A\} \le \overline{\lim}_\gamma\ \gamma \log P\{x_n^\gamma(\cdot) \in A\} \le -S_n(T,\bar A),$

where $H_n$ is the H-functional arising from the processes $\{x_n^\gamma(\cdot),\ \gamma > 0\}$, $n = 1,2,\ldots$, each of which is of the form (1.1), (1.2) or the interpolation of forms (1.3), (1.4) for suitable $b_n, \sigma_n, \bar b_n, \xi^n$ replacing $b, \sigma, \bar b, \xi$, and $S_n$ is the Cramér transformation of $H_n$. Let $\Delta_0 > 0$ and compact $G_0$ be such that for $\phi(\cdot) \in N_{\Delta_0}(A)$, $\phi(t) \in G_0$, $t \le T$.

Theorem 4.9. Let $S(T,A^0) = S(T,\bar A)$ if the $U(x)$ are bounded. Assume (A1.2',3) where $G_1$ is replaced by $G_0$ in (A1.2'). Assume (A4.1,2) and also (A4.3,4) for the bounded $U(x)$ case, and (A4.5) for the unbounded $U(x)$ case. Then in the bounded $U(x)$ case, $S_n(T,A) \to S(T,A^0)$, and in the unbounded $U(x)$ case, $\underline{\lim}_n S_n(T,\bar A) \ge S(T,\bar A)$, $\overline{\lim}_n S_n(T,A^0) \le S(T,A^0)$.

Proof. Only an outline will be given. The techniques are similar to those used in the previous theorems of this section. Fix $\epsilon > 0$. Let $\phi_n^\epsilon(\cdot) \in \bar A$ be such that $S_n(T,\phi_n^\epsilon) \le S_n(T,\bar A) + \epsilon$. We can always choose such a sequence for which there is a convergent subsequence. Let $n$ index the subsequence and denote the limit by $\phi^\epsilon(\cdot)$. Then $\phi^\epsilon(\cdot) \in \bar A$ and by Theorem 4.2 and the arbitrariness of $\epsilon > 0$, we have

(4.14) $\underline{\lim}_n S_n(T,\bar A) \ge S(T,\bar A).$

To get the reverse inequality, first consider the 'unbounded $U(x)$' case, and let $\phi^\epsilon(\cdot)$ be an $\epsilon$-optimal path for $S(T,A^0)$ such that $|\dot\phi^\epsilon(t)|$ is bounded. Then use an argument similar to that used in connection with (4.7a) to get

(4.15) $\overline{\lim}_n S_n(T,A^0) \le S(T,A^0),$

and the theorem is proved for the unbounded $U(x)$ case. We need not concern ourselves with 'exit times' in this theorem.

Now, to complete the proof for the bounded $U(x)$ case, we use the technique and terminology of Theorem 4.8. Define $x^{\gamma,\delta}(\cdot)$ as above Theorem 4.8. Suppose that $A^0$ is non-empty and for small $\rho > 0$ define the open set $A_\rho = \{\phi : \phi \in A,\ d(\phi,\partial A) > \rho\}$. For small $\delta > 0$

(4.16) $P\{x^{\gamma,\delta}(\cdot) \in A\} \ge P\{x^\gamma(\cdot) \in A_\rho\},$

and for each $h > 0$ there is a $\gamma(h) > 0$ such that for $\gamma \le \gamma(h)$, the right side of (4.16) is $\ge \exp -[S(T,A_\rho) + h]/\gamma$ and the left side is $\le \exp -[S^\delta(T,\bar A) - h]/\gamma$. Using the terminology and result of Theorem 4.8, $\tilde S^\delta(T,\bar A) - c(\delta)T \le S^\delta(T,\bar A)$. Now, by the hypothesis $S(T,A^0) = S(T,\bar A)$, we have that $S(T,A_\rho) \downarrow S(T,\bar A) = S(T,A^0)$ as $\rho \to 0$. Now, letting $\rho \to 0$ on the left,

(4.17) $h + S(T,A^0) \leftarrow S(T,A_\rho) + h \ge S^\delta(T,\bar A) - h \ge \tilde S^\delta(T,\bar A) - c(\delta)T - h.$

Also, as in Theorem 4.8,

(4.18) $\underline{\lim}_\delta \tilde S^\delta(T,\bar A) \ge S(T,\bar A) = S(T,A^0).$

Relations (4.17), (4.18) imply that $\lim_{\delta \to 0} \tilde S^\delta(T,A^0) = S(T,A^0)$. Thus, for small $\delta > 0$, we can select an $\epsilon$-optimal path $\phi^\epsilon$ for $S(T,A^0)$ with $\dot\phi^\epsilon(t) \in \tilde U^\delta(\phi^\epsilon(t))$ (or, equivalently, in $U^{-\delta}(\phi^\epsilon(t))$, $t \le T$, if we wish).

The proof of (4.15) follows from this, the convergence of $L_n(\cdot,\cdot)$ to $L(\cdot,\cdot)$ uniformly on $\{x,\beta : x \in G_0,\ \beta \in U^{-\delta}(x)\}$ (or the analogous result for the degenerate case) and the boundedness of $U(x)$. Q.E.D.

5. Examples of convergence of $H_n$ to Gaussian H-functional.

5.1. Let $\bar b_n(\cdot)$ and $b_n(\cdot,\xi)$ be Lipschitz continuous, uniformly in $\xi$. Let $N_n \to \infty$ as $n \to \infty$, and let $\{\xi_{ki}^n,\ i \ge 0,\ k \ge 0\}$ be i.i.d. with $E b_n(x,\xi_{ki}^n) = 0$, and define $\tilde b_n(x,\xi_k^n) = \sum_{i=1}^{N_n} b_n(x,\xi_{ki}^n)/\sqrt{N_n}$. Define $\{x_k^\gamma\}$ (suppress the $n$ index on $x_k^\gamma$) by

(5.1) $x_{k+1}^\gamma = x_k^\gamma + \gamma \bar b_n(x_k^\gamma) + \gamma \tilde b_n(x_k^\gamma,\xi_k^n).$

Let $H_n^0$ denote the H-functional when $\bar b_n(x) = 0$. For convergence to the Gaussian H-functional $H(x,\alpha) = \alpha'\bar b(x) + \alpha'\Sigma(x)\alpha/2$ we need $\bar b_n(x) \to \bar b(x)$ and

$H_n^0(x,\alpha) = N_n \log E \exp \alpha' b_n(x,\xi_{ki}^n)/\sqrt{N_n} \to \alpha'\Sigma(x)\alpha/2,$

uniformly in $x \in G_1$ for some smooth $\Sigma(x)$. If the $\xi_{ki}^n$ are bounded, then clearly $\Sigma(x) = \lim_{n \to \infty} E b_n(x,\xi_{ki}^n) b_n'(x,\xi_{ki}^n)$. But in general, the convergence or lack of it depends on the higher moments of $b_n(x,\xi_{ki}^n)$.
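The convergence of the normalized-sum H-functional to its Gaussian limit can be seen in the simplest bounded case. The sketch below assumes, purely for illustration, a scalar term $b_n(x,\xi) = \xi$ with $\xi = \pm 1$ equally likely, so that $E \exp(\alpha\xi/\sqrt{N}) = \cosh(\alpha/\sqrt{N})$ and the limit is $\alpha^2/2$ (here $\Sigma = 1$).

```python
import math

def H_n(alpha, N):
    """N * log E exp(alpha * xi / sqrt(N)) for xi = +1 or -1 with probability 1/2 each."""
    return N * math.log(math.cosh(alpha / math.sqrt(N)))

alpha = 1.5
gaussian_limit = alpha ** 2 / 2.0
# Distance to the Gaussian form for increasing numbers of summands N.
errs = [abs(H_n(alpha, N) - gaussian_limit) for N in (10, 100, 1000, 10000)]
print(errs)
```

The leading error term in this example is of order $\alpha^4/N$, which reflects the dependence on the fourth moment mentioned in the text.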

N 

n 

5.2. Now let $\tilde b_n(x,\xi_k^n) = \sum_{i=1}^{N_n} b_n(x,\xi_{ki}^n)$, where $N_n$ is Poisson with parameter $\lambda_n$, and for each $n$, $E b_n(x,\xi_{ki}^n) = 0$ and $\{\xi_{ki}^n,\ k \ge 0,\ i \ge 0\}$ are i.i.d. for each $n$. Then

$H_n^0(x,\alpha) = \lambda_n [E \exp \alpha' b_n(x,\xi_{ki}^n) - 1].$

Let $\lambda_n \to \infty$ and $\lambda_n E b_n(x,\xi_{ki}^n) b_n'(x,\xi_{ki}^n) \to \Sigma(x)$ uniformly in $x \in G_1$, as $n \to \infty$. Then, for $H_n$ to converge to the Gaussian H-functional, it is sufficient that $\bar b_n(x) \to \bar b(x)$ and that

$\lambda_n \sum_{l=3}^{\infty} |\alpha|^l E|b_n(x,\xi_{ki}^n)|^l / l! \to 0$

uniformly in bounded $\alpha$-sets, as $n \to \infty$. This depends on the higher moments of $b_n(x,\xi_{ki}^n)$. If $|b_n(x,\xi_{ki}^n)| \le \delta_n \to 0$ as $n \to \infty$ and $\overline{\lim}_n \lambda_n \delta_n^2 < \infty$, then the sum converges to zero as desired.
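The sufficient condition for the Poisson case can also be checked on a simple example. This sketch assumes hypothetical scalar jumps $b_n = \pm\delta_n$ with equal probability and the scaling $\lambda_n \delta_n^2 = 1$, so that $H_n^0(\alpha) = \lambda_n(\cosh(\alpha\delta_n) - 1)$ and the Gaussian limit is $\alpha^2/2$.

```python
import math

def poisson_H(alpha, lam, delta):
    """lam * (E exp(alpha * b) - 1) for jumps b = +delta or -delta, each with probability 1/2."""
    return lam * (math.cosh(alpha * delta) - 1.0)

alpha = 1.5
gaps = []
for lam in (1e1, 1e2, 1e3, 1e4):
    delta = 1.0 / math.sqrt(lam)  # keeps lam * delta**2 = 1 (assumed scaling)
    gaps.append(abs(poisson_H(alpha, lam, delta) - alpha ** 2 / 2.0))
print(gaps)
```

The gap shrinks like $\delta_n^2$, in line with the bound $|b_n| \le \delta_n \to 0$ with $\lambda_n\delta_n^2$ bounded.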

5.3. Consider the continuous parameter case

(5.2) $dx^\gamma = \bar b(x^\gamma)\,dt + \sigma(x^\gamma)\,dJ_n(t/\gamma),$

where $J_n(\cdot)$ is a centered Poisson jump process with rate $\lambda_n$ and jump random variables $\{\xi_i\}$. Then

(5.3) $H_n(x,\alpha) = \alpha'\bar b(x) + \lim_{\gamma \to 0} \gamma \log E \exp \alpha' \int_0^{1/\gamma} \sigma(x)\,dJ_n(t) = \alpha'\bar b(x) + \lambda_n [E \exp \alpha'\sigma(x)\xi_i - 1],$

and the comments made in the discrete parameter case also apply here.

5.4. Let $J(\cdot)$ be a jump process with jump intervals $c > 0$ and bounded i.i.d. jump random variables $\{\psi_i\}$ with $E\psi_i = 0$ and consider the system

(5.4) $\dot x^\gamma = \bar b(x^\gamma) + v(x^\gamma)\xi(t/\gamma),$

where $\xi(\cdot)$ is the filtered noise

(5.5) $\xi(t) = \int_0^t h(t-s)\,dJ(s).$

For computational simplicity, let $h(s) = \exp(-as)$, $a > 0$, and set $K_\gamma = 1/c\gamma$. Then

$\int_0^{1/\gamma} \xi(t)\,dt = \int_0^{1/\gamma} dt \int_0^t h(t-s)\,dJ(s) = \int_0^{1/\gamma} dJ(s) \int_s^{1/\gamma} h(t-s)\,dt = \sum_{i < 1/c\gamma} \frac{\psi_i}{a} [1 - \exp(-ac(K_\gamma - i))].$

Thus

(5.6) $\lim_{\gamma \to 0} \gamma \log E \exp \alpha v(x) \int_0^{1/\gamma} \xi(s)\,ds = \lim_{\gamma \to 0} \gamma \log \left(E \exp \frac{\alpha v(x)\psi_i}{a}\right)^{1/c\gamma} = \frac{1}{c} \log E \exp \frac{\alpha v(x)\psi_i}{a}.$

Now, replace $(c, a, \psi_i)$ by $(c_n, a_n, \psi_i^n)$, and let $|\psi_i^n| \le \delta_n$, where $\delta_n/a_n \to 0$ as $n \to \infty$. Let $\lim_n E(\psi_i^n)^2/a_n^2 c_n = u^2 > 0$. Then as $n \to \infty$, (5.6) converges to the Gaussian form $\alpha^2 v^2(x) u^2/2$. If the deterministic intervals $c$ were replaced by i.i.d. and exponentially distributed intervals, then (5.4), (5.5) would be close to actual physical noise models. We expect that the same conclusions would hold in this case, suggesting that the Gaussian approximation is indeed useful for a large class of physical noise models.

6.  A  Phase  Locked  Loop  (PLL)  Example. 

This example does not completely fit the previous theorems, but it represents an important and interesting class of applications where further work is required, and where approximation theorems such as those here are essential if the results are to be physically meaningful. Let $\{z_i(\cdot), i = 1, 2\}$ be mutually independent, with $z_i(\cdot)$ scalar valued Gaussian with mean zero and integrable covariance function $\rho(\cdot)$, and $\psi_i$ uniformly distributed on $[0, 2\pi]$. A standard method of representing wide bandwidth but 'band pass' noise $n^\gamma(\cdot)$ in communication systems is by using the form

$$n^\gamma(t) = [z_1^\gamma(t) \cos(\omega_0^\gamma t + \psi_1) + z_2^\gamma(t) \sin(\omega_0^\gamma t + \psi_2)],$$

where $\omega_0^\gamma = \omega_0/\gamma$, $z_i^\gamma(t) = z_i(t/q_\gamma)$, and $q_\gamma \to 0$ as $\gamma \to 0$. For notational simplicity, we set $\psi_i = 0$. The bandwidth of $n^\gamma(\cdot)$ is $O(1/q_\gamma)$ and the center frequency is $O(1/\gamma)$. Let the input $y^\gamma(\cdot)$ to the system be the sum of a signal plus noise

$$y^\gamma(t) = A^\gamma(t) \sin(\omega_0^\gamma t + \theta) + n^\gamma(t),$$


where $A^\gamma(t) = A_0(t/q_\gamma)$ is a deterministic signal. Suppose that there is a constant $A \ge 0$ such that the convergence

$$\lim_{\gamma} q_\gamma \int_{T_0/q_\gamma}^{T_1/q_\gamma} A_0(t)\,dt = A(T_1 - T_0)$$

is uniform in $(T_1 - T_0)$. As noted in more detail at the end of this section, the function of the PLL is to track changes in $\theta(\cdot)$, a job of fundamental importance in many modern low error communications systems [10], [11]. The scaling used here for the input signal and noise allows us to exploit the asymptotic method, and the general type of scaling used is consistent with that required by many applications where the center frequencies and bandwidths are large but the bandwidth is small relative to the center frequency. In fact, to use asymptotic methods on such problems (i.e., to be able to replace the actual system noises by simple stochastic processes), such a relation between the bandwidth and center frequency seems to be required.

The dynamical equations of the two forms of PLL of Fig. 1 are given by (6.1a) and (6.1b), respectively.

$$\dot v^\gamma = Dv^\gamma + Hw^\gamma, \quad \dot{\hat\theta}^\gamma = Cv^\gamma, \quad w^\gamma(t) = y^\gamma(t) \cos(\omega_0^\gamma t + \hat\theta^\gamma(t)). \tag{6.1a}$$

(6.1b)

The $\cos(\omega_0^\gamma t + \hat\theta^\gamma)$ term is generated by the system's 'voltage controlled oscillator'. Also, $v^\gamma(\cdot)$ is the state of a stable filter which is used in the 'forward' path in Figure 1a. A trigonometric expansion of $w^\gamma(t)$ yields terms involving sin or cos of $2\omega_0^\gamma t$. If we retained these terms, their effects would drop out below when $\lim_{\gamma \to 0}$ is taken. So, for convenience, we expand $w^\gamma(t)$, drop these 'high frequency' terms and replace $w^\gamma(t)$ by

$$u^\gamma(t) = A^\gamma(t) \sin(\theta - \hat\theta^\gamma)/2 + [z_1^\gamma(t) \cos \hat\theta^\gamma - z_2^\gamma(t) \sin \hat\theta^\gamma]/2.$$
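The trigonometric expansion can be checked directly. The sketch below is an illustration rather than part of the paper: with $\psi_i = 0$, it splits the mixer output $y\cos(\omega_0^\gamma t + \hat\theta)$ into the retained low frequency part and the dropped double frequency part, and verifies the identity at random points. The function names, and the symbol `phase` standing for $\omega_0^\gamma t$, are assumptions made for the example.

```python
import math
import random


def mixer_output(A, theta, theta_hat, z1, z2, phase):
    """w = y * cos(phase + theta_hat), where phase stands for omega_0 * t."""
    y = A * math.sin(phase + theta) + z1 * math.cos(phase) + z2 * math.sin(phase)
    return y * math.cos(phase + theta_hat)


def baseband_term(A, theta, theta_hat, z1, z2):
    """The retained part: A sin(theta - theta_hat)/2 + [z1 cos - z2 sin]/2."""
    noise = z1 * math.cos(theta_hat) - z2 * math.sin(theta_hat)
    return 0.5 * A * math.sin(theta - theta_hat) + 0.5 * noise


def double_frequency_term(A, theta, theta_hat, z1, z2, phase):
    """The dropped part: every term oscillates at twice the center frequency."""
    return 0.5 * (
        A * math.sin(2 * phase + theta + theta_hat)
        + z1 * math.cos(2 * phase + theta_hat)
        + z2 * math.sin(2 * phase + theta_hat)
    )


def max_identity_error(trials=1000, seed=0):
    """Largest |w - (baseband + double frequency)| over random test points."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        args = [rng.uniform(-3.0, 3.0) for _ in range(5)]
        phase = rng.uniform(0.0, 100.0)
        lhs = mixer_output(*args, phase)
        rhs = baseband_term(*args) + double_frequency_term(*args, phase)
        worst = max(worst, abs(lhs - rhs))
    return worst
```

Only `baseband_term` is free of the fast phase, which is why it alone is kept in the limit.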


Define

$$b(x) = \begin{pmatrix} Dv + HA \sin(\theta - \hat\theta)/2 \\ Cv \end{pmatrix} = \begin{pmatrix} b_1(x) \\ b_2(x) \end{pmatrix}.$$


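Then the H-functional for (6.1a) is (note that the appropriate scaling is $q_\gamma$, not $\gamma$, here)

$$H(x,\alpha) = \alpha' b(x) + \lim_{\gamma \to 0} q_\gamma \log E \exp\left\{ \frac{\alpha' H}{2} \int_0^{1/q_\gamma} [z_1(t) \cos \hat\theta - z_2(t) \sin \hat\theta]\,dt \right\} = \alpha' b(x) + \frac{(\alpha' H)^2}{2} \bar\sigma^2, \qquad \bar\sigma^2 = \frac{1}{4} \int_{-\infty}^{\infty} \rho(s)\,ds. \tag{6.2}$$
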


This is also the H-functional for the system

$$dv = b_1(x)\,dt + H \bar\sigma \sqrt{q_\gamma}\,dw,$$

where $w(\cdot)$ is a standard Wiener process. Since $w^\gamma(t) = \int_0^t [z_1^\gamma(s) \cos \hat\theta^\gamma - z_2^\gamma(s) \sin \hat\theta^\gamma]\,ds/2\sqrt{q_\gamma}$ converges weakly to a Wiener process $\bar w(\cdot)$ with variance $\bar\sigma^2 t$, the small 'white noise' approximation to (6.1) makes sense here. But we are not aware of a proof that $H(x,\alpha)$ actually gives an action functional and the exit time formulas (1.6) to (1.8) for the systems of (6.1a) or (6.1b), where the normalizing factor $\gamma$ is replaced by $q_\gamma$. Possibly such a proof can be based on Azencott [1] for this purely Gaussian process. In any case, it is not adequate simply to proceed from there without some sort of limit or approximation argument.
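For Gaussian $z_i(\cdot)$ the scaled limit appearing in (6.2) can be evaluated in closed form, since $\log E \exp(cG) = c^2 \operatorname{Var}(G)/2$ for a mean zero Gaussian $G$. The sketch below is an illustration under an assumed covariance $\rho(s) = \exp(-|s|)$, which the paper does not specify; here `c` stands for $\alpha' H/2$, and the function names are hypothetical.

```python
import math


def scaled_log_mgf(c, q):
    """q * log E exp(c * G), with G = int_0^{1/q} [z1 cos(th) - z2 sin(th)] dt.

    Assumes independent stationary Gaussian z1, z2 with covariance exp(-|s|).
    Since cos^2 + sin^2 = 1, Var(G) equals the variance of int_0^T z(t) dt.
    """
    T = 1.0 / q
    # Var of int_0^T z(t) dt for covariance exp(-|s|) is 2 * (T - 1 + exp(-T)).
    var_G = 2.0 * (T - 1.0 + math.exp(-T))
    return q * 0.5 * c * c * var_G


def limit_value(c):
    """(c^2 / 2) * integral of rho over the real line; the integral equals 2."""
    return c * c


if __name__ == "__main__":
    for q in (1e-1, 1e-2, 1e-4):
        print(q, scaled_log_mgf(0.5, q), limit_value(0.5))
```

With $c = \alpha' H/2$ the limit is $(\alpha' H)^2 \int_{-\infty}^{\infty} \rho(s)\,ds / 8$, a quadratic form in $\alpha$ of the kind a small white noise model produces. The calculation uses Gaussianity in an essential way and says nothing about the non-Gaussian case.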

Although $w^\gamma(\cdot)$ converges weakly to $\bar w(\cdot)$ if the $z_i(\cdot)$ are only (sufficiently) strongly mixing but not Gaussian, the H-functionals are then not usually of the form (6.2). Suppose that $n^\gamma(\cdot)$ was obtained from an impulsive or shot noise process which was suitably filtered in order to guarantee that the actual noise entering the system has bandwidth $O(1/q_\gamma)$ and center frequency $O(1/\gamma)$. Rough calculations similar to those in Section 5.4 suggest that the limit would take the form (6.2) under reasonable conditions. Such a result would be quite useful in applications; in many cases, such shot noise based processes are closer to the true physical noise than is the Gaussian noise. It would also be interesting to work with $n^\gamma(t)/q_\gamma^\delta$, for some $\delta < 1/2$, and use Freidlin's idea of moderate deviations [5].

The PLL systems considered above are an important class of applications to which large deviations or singular perturbation methods have been applied [12], [16], although it is now common practice to ignore the limit and approximation questions, and even the (usual) 'pass-band' nature of the PLL, in order to write down a 'small' noise Ito equation model directly and allow the analysis to start from there.


Let $\theta(t) \equiv \theta_0$. The mean equation is $\dot x = b(x)$, and for the usual filters, $(0, \theta_0) = x_0$ is a locally asymptotically stable point of this equation. For the simple 'first order' PLL of Fig. 1, there is no filter and the limit equation is $\dot{\hat\theta} = KA \sin(\theta - \hat\theta)/2$, where $K > 0$ is a scalar. An important communications theory problem is to estimate the minimum time for $(v(t), \hat\theta(t))$ or $\hat\theta(t)$ to leave the stability set of the limit equation. Owing to the difficulty of the problem, and to the fact that the noise is often 'rapidly fluctuating' and with 'small' effects, 'small noise' methods are appealing. Above, we have given an outline of the role of the theory of large deviations. But for the actual physical non-white noise model, a number of questions concerning modelling and approximation of the noise, and proof of the escape time formula (1.8), still remain.
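The stability set of the first order limit equation can be seen in a short simulation. The following is a minimal sketch, not from the paper: it integrates $\dot{\hat\theta} = KA \sin(\theta_0 - \hat\theta)/2$ by forward Euler, with hypothetical values for the gain $KA$, the step size, and the horizon.

```python
import math


def track_phase(theta0, theta_hat0, gain_KA, dt=1e-3, steps=20000):
    """Forward Euler for d(theta_hat)/dt = (KA/2) * sin(theta0 - theta_hat).

    gain_KA is the product K*A; dt and steps are illustrative choices only.
    """
    theta_hat = theta_hat0
    for _ in range(steps):
        theta_hat += dt * 0.5 * gain_KA * math.sin(theta0 - theta_hat)
    return theta_hat


if __name__ == "__main__":
    theta0 = 0.3
    # Start inside (theta0 - pi, theta0 + pi): the estimate returns to theta0.
    print(track_phase(theta0, theta0 + 2.5, gain_KA=2.0))
    # Start just outside: the estimate settles at theta0 + 2*pi instead.
    print(track_phase(theta0, theta0 + math.pi + 0.5, gain_KA=2.0))
```

An initial error smaller than $\pi$ in magnitude decays to zero, while a larger one is captured by a neighbouring equilibrium $\theta_0 \pm 2\pi$. Escape from the stability set therefore corresponds to the loop slipping a cycle, which is the event whose timing the escape time formulas are meant to estimate.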


Fig. 1. Phase Locked Loops
Appendix

The proof of (1.6) in [5] is not quite valid for (1.2), (1.4), since $\{\rho_n\}$ is unbounded. The proofs in [3] do not account for the $\xi_n$ or $\xi(\cdot)$ terms. If $\sigma(x) = \sigma$, the proofs given or referenced in [5] remain valid, with a few modifications. Here, we remark on the required changes, without proofs. For concreteness, only the discrete parameter case will be dealt with.

The set $\{\phi : S(\phi) \le a, \phi(0) = x\}$ on top of p. 136 [5] is still compact in the unbounded $U(x)$ case, by (A4.5). Lemma 3.1 of [5] requires a few modifications, since $\{\rho_n\}$ is unbounded. The inequality below (3.2), p. 138 [5], is no longer true, but it does hold modulo the probability of a set $A_{\epsilon,\Delta}$, where $P\{A_{\epsilon,\Delta}\} \le \exp(-N/\epsilon)$, and where $N$ can be made as large as we wish by choosing $\epsilon, \Delta$ small enough. Similarly, the set inclusion below [5, p. 138] holds modulo a set of probability $\le \exp(-N/\epsilon)$, where $N \to \infty$ as $\Delta \to 0$, $\epsilon \to 0$. With these changes Lemma 3.1 of [5] holds.

Proof of Theorem 2.1 [5]. If $\sigma(x)$ = constant, the last set inclusion on [5, p. 141] holds by the uniform Lipschitz condition on $b(\cdot, \xi)$. Concerning the argument on p. 142 [5], the trajectories $x^{\epsilon,\Delta}(\cdot)$, $x^\epsilon(\cdot)$ do not belong to a compact set. But for any large $N$, there is a set of probability $\ge 1 - \exp(-N/\epsilon)$ such that on this set the trajectories do belong to a compact set of continuous functions: they satisfy a common Hölder condition.